Abstract

In this thesis, we propose and analyze a multi-server model that captures a per- 
formance trade-off between centralized and distributed processing. In our model, a 
fraction p of an available resource is deployed in a centralized manner (e.g., to serve 
a most-loaded station) while the remaining fraction 1 - p is allocated to local servers 
that can only serve requests addressed specifically to their respective stations. 

Using a fluid model approach, we demonstrate a surprising phase transition in
the steady-state delay, as p changes: in the limit of a large number of stations, and
when any amount of centralization is available (p > 0), the average queue length
in steady state scales as $\log_{\frac{1}{1-p}} \frac{1}{1-\lambda}$ when the traffic intensity $\lambda$ goes to 1. This is
exponentially smaller than the usual M/M/1-queue delay scaling of $\frac{1}{1-\lambda}$, obtained
when all resources are fully allocated to local stations (p = 0). This indicates a strong
qualitative impact of even a small degree of centralization.

We prove convergence to a fluid limit, and characterize both the transient and
steady-state behavior of the finite system, in the limit as the number of stations N
goes to infinity. We show that the sequence of queue-length processes converges to
a unique fluid trajectory (over any finite time interval, as $N \to \infty$), and that this
fluid trajectory converges to a unique invariant state $\mathbf{v}^I$, for which a simple
closed-form expression is obtained. We also show that the steady-state distribution of the
N-server system concentrates on $\mathbf{v}^I$ as N goes to infinity.

Thesis Supervisor: John N. Tsitsiklis 

Title: Clarence J. Lebel Professor of Electrical Engineering 



Acknowledgments 



I would like to express my deepest gratitude to my thesis supervisor, Professor 
John N. Tsitsiklis, for his invaluable guidance and support over the last two years. 

I would like to thank Yuan Zhong (MIT) for his careful reading of an early draft 
of this thesis. 

I would like to thank Professor Devavrat Shah (MIT) and Professor Julien M. 
Hendrickx (Catholic University of Louvain) for helpful discussions on related sub- 
jects. 

I would like to thank my family for their love and constant support over the years. 

This research was supported in part by an MIT Jacobs Presidential Fellowship, 
a Xerox-MIT Fellowship, a Siebel Scholarship, and NSF grant CCF-0728554. 



Contents

1 Introduction
1.1 Distributed versus Centralized Processing
1.1.1 Primary Motivation: Server Farm with Local and Central Servers
1.1.2 Secondary Motivation: Partially Centralized Scheduling
1.2 Overview of Main Contributions
1.3 Related Work
1.4 Organization of the Thesis

2 Model and Notation
2.1 Model
2.2 System State
2.3 Notation

3 Summary of Main Results
3.1 Definition of Fluid Model
3.2 Analysis of the Fluid Model and Exponential Phase Transition
3.3 Convergence to a Fluid Solution - Finite Horizon and Steady State

4 Probability Space and Coupling
4.1 Overview of Technical Approach
4.2 Definition of Probability Space
4.3 A Coupled Construction of Sample Paths

5 Fluid Limits of Stochastic Sample Paths
5.1 Tightness of Sample Paths over a Nice Set
5.2 Derivatives of the Fluid Limits

6 Properties of the Fluid Model
6.1 Invariant State of the Fluid Model
6.2 Uniqueness of Fluid Limits & Continuous Dependence on Initial Conditions
6.2.1 $\mathbf{v}(\cdot)$ versus $\mathbf{s}(\cdot)$, and the Uniqueness of Fluid Limits
6.3 Convergence to Fluid Solution over a Finite Horizon
6.4 Convergence to the Invariant State $\mathbf{v}^I$
6.4.1 A Finite-support Property of $\mathbf{v}(\cdot)$ and Its Implications

7 Convergence of Steady-State Distributions
7.1 Uniform Rate of Convergence to the Fluid Limit
7.2 Proof of Theorem 7

8 Conclusions and Future Work

A Appendix: Additional Proofs
A.1 Complete Proof of Proposition 11
A.2 Proof of Proposition

B Appendix: Simulation Setup

List of Figures

1-1 Server farm with local and central servers
1-2 Centralized scheduling with communication constraints
3-1 Values of $\mathbf{s}_i^I$, as a function of i, for p = 0 and p = 0.05, with traffic intensity $\lambda = 0.99$
3-2 Illustration of the exponential improvement in average queue length from $O(\frac{1}{1-\lambda})$ to $O(\log \frac{1}{1-\lambda})$ as $\lambda \to 1$, when we compare p = 0 to p = 0.05
3-3 Relationships between convergence results
4-1 Illustration of the partition of [0,1] for constructing $\mathbf{V}^N(\omega, \cdot)$
6-1 Comparison between the $\mathbf{V}^N[\cdot]$ and $\mathbf{S}^N[\cdot]$ representations

List of Tables

3.1 Values of $i^*(p, \lambda)$ for various combinations of $(p, \lambda)$

Chapter 1 
Introduction 

1.1 Distributed versus Centralized Processing 

The tension between distributed and centralized processing seems to have existed 
ever since the inception of computer networks. Distributed processing allows for 
simple implementation and robustness, while a centralized scheme guarantees opti- 
mal utilization of computing resources at the cost of implementation complexity and 
communication overhead. A natural question is how performance varies with the 
degree of centralization. Such an understanding is of great interest in the context of, 
for example, infrastructure planning (static) or task scheduling (dynamic) in large 
server farms or cloud clusters, which involve a trade-off between performance (e.g., 
delay) and cost (e.g., communication infrastructure, energy consumption, etc.). In 
this thesis, we address this problem by formulating and analyzing a multi-server 
model with an adjustable level of centralization. We begin by describing informally 
two motivating applications. 

1.1.1 Primary Motivation: Server Farm with Local and Central Servers

Consider a server farm consisting of N stations, depicted in Figure 1-1. Each station
is fed with an independent stream of tasks, arriving at a rate of $\lambda$ tasks per second,
with $0 < \lambda < 1$.^1 Each station is equipped with a local server with identical
performance; the server is local in the sense that it only serves its own station. All stations
are also connected to a single centralized server which will serve a station with the
longest queue whenever possible.

^1 Without loss of generality, we normalize so that the largest possible arrival rate is 1.

We consider an N-station system. The system designer is granted a total amount
N of divisible computing resources (e.g., a collection of processors). In a loose sense
(to be formally defined in Section 2.1), this means that the system is capable of
processing N tasks per second when fully loaded. The system designer is faced
with the problem of allocating computing resources to local and central servers.
Specifically, for some $p \in (0, 1)$, each of the N local servers is able to process tasks
at a maximum rate of 1 - p tasks per second, while the centralized server, equipped
with the remaining computing power, is capable of processing tasks at a maximum
rate of pN tasks per second. The parameter p captures the amount of centralization
in the system. Note that since the total arrival rate is $\lambda N$, with $0 < \lambda < 1$, the system
is underloaded for any value $p \in (0, 1)$.

When the arrival processes and task processing times are random, there will be 
times when some stations are empty while others are loaded. Since a local server 
cannot help another station process tasks, the total computational resources will be 
better utilized if a larger fraction is allocated to the central server. However, a greater 
degree of centralization (corresponding to a larger value of p) entails more frequent 
communications and data transfers between the local stations and the central server, 
resulting in higher infrastructure and energy costs. 

How should the system designer choose the coefficient p? Alternatively, we can 
ask an even more fundamental question: is there any significant difference between 
having a small amount of centralization (a small but positive value of p), and complete 
decentralization (no central server and p = 0)? 

1.1.2 Secondary Motivation: Partially Centralized Scheduling

Consider the system depicted in Figure 1-2. The arrival assumptions are the same
as in Section 1.1.1. However, there is no local server associated with a station; all
stations are served by a single central server. Whenever the central server becomes
free, it chooses a task to serve as follows. With probability p, it processes a task from
a most loaded station, with an arbitrary tie-breaking rule. Otherwise, it processes
a task from a station selected uniformly at random; if the randomly chosen station
has an empty queue, the current round is in some sense "wasted" (to be formalized
in Section 2.1).

This second interpretation is intended to model a scenario where resource allocation
decisions are made at a centralized location on a dynamic basis, but communications
between the decision maker (central server) and local stations are costly or
simply unavailable from time to time. While it is intuitively obvious that
longest-queue-first (LQF) scheduling is more desirable, up-to-date system state information
(i.e., queue lengths at all stations) may not always be available to the central server.
Thus, the central server may be forced to allocate service blindly. In this setting,
a system designer is interested in setting the optimal frequency (p) at which global
state information is collected, so as to balance performance and communication costs.

As we will see in the sequel, the system dynamics in the two applications are
captured by the same mathematical structure under appropriate stochastic
assumptions on task arrivals and processing times, and hence will be addressed jointly in
this thesis.

Figure 1-1: Server farm with local and central servers.

Figure 1-2: Centralized scheduling with communication constraints. (Stations 1 through N are served by one scheduler, which applies LQF with probability p and random selection with probability 1 - p.)
1.2 Overview of Main Contributions

We provide here an overview of the main contributions. Exact statements of the
results will be provided in Chapter 3 after the necessary terminology has been
introduced.

Our goal is to study the performance implications of varying degrees of
centralization, as expressed by the coefficient p. To accomplish this, we use a so-called
fluid approximation, whereby the queue length dynamics at the local stations are
approximated, as $N \to \infty$, by a deterministic fluid model, governed by a system of
ordinary differential equations (ODEs).

Fluid approximations typically involve results of two flavors: qualitative results 
derived from the fluid model that give insights into the performance of the original 
finite stochastic system, and technical convergence results (often mathematically in- 
volved) that justify the use of such approximations. We summarize our contributions 
along these two dimensions: 

1. On the qualitative end, we derive an exact expression for the invariant state
of the fluid model, for any given traffic intensity $\lambda$ and centralization coefficient
p, thus characterizing the steady-state distribution of the queue lengths in the
system as $N \to \infty$. This enables a system designer to use any performance
metric and analyze its sensitivity with respect to p. In particular, we show a
surprising exponential phase transition in the scaling of average system delay
as the load approaches capacity ($\lambda \to 1$) (Corollary 3 in Section 3.2): when
an arbitrarily small amount of centralized computation is applied (p > 0), the
average queue length in the system scales as^2

$$\mathbb{E}(Q) \sim \log_{\frac{1}{1-p}} \frac{1}{1-\lambda}, \tag{1.1}$$

as the traffic intensity $\lambda$ approaches 1. This is drastically smaller than the
$\frac{1}{1-\lambda}$ scaling obtained if there is no centralization (p = 0).^3 This suggests that
for large systems, even a small degree of centralization provides significant
improvements in the system's delay performance, in the heavy traffic regime.

^2 The $\sim$ notation used in this thesis is to be understood as asymptotic closeness in the following
sense: $[f(x) \sim g(x), \text{ as } x \to 1] \Leftrightarrow \lim_{x \to 1} \frac{f(x)}{g(x)} = 1$.

^3 When p = 0, the system degenerates into N independent queues. The $\frac{1}{1-\lambda}$ scaling comes from
the mean queue length expression for M/M/1 queues.

2. On the technical end, we show that:

(a) Given any finite initial queue sizes, and with high probability, the
evolution of the queue length process can be approximated by the unique
solution to a fluid model, over any finite time interval, as $N \to \infty$.

(b) All solutions to the fluid model converge to a unique invariant state, as
$t \to \infty$, for any finite initial condition (global stability).

(c) The steady-state distribution of the finite system converges to the
invariant state of the fluid model as $N \to \infty$.

The most notable technical challenge comes from the fact that the
longest-queue-first policy used by the centralized server causes discontinuities in the
drift in the fluid model (see Section 3.1 for details). In particular, the classical
approximation results for Markov processes (see, e.g., [2]), which rely on a
Lipschitz-continuous drift in the fluid model, are hard to apply. Thus, in order
to establish the finite-horizon approximation result (a), we employ a
sample-path based approach: we prove tightness of sample paths of the queue length
process and characterize their limit points. Establishing the convergence of
steady-state distributions in (c) also becomes non-trivial due to the presence of
discontinuous drifts. To derive this result, we will first establish the uniqueness
of solutions to the fluid model and a uniform speed of convergence of stochastic
sample paths to the solution of the fluid model over a compact set of initial
conditions.
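As a rough numerical illustration of the scaling gap described in contribution 1, the following Python sketch evaluates the two leading-order expressions, $\log_{\frac{1}{1-p}} \frac{1}{1-\lambda}$ for p > 0 and $\frac{1}{1-\lambda}$ for p = 0, at a few loads. The function names are hypothetical and introduced only for this illustration; the values are asymptotic scalings, not exact mean queue lengths.

```python
import math


def centralized_scaling(p: float, lam: float) -> float:
    """Leading-order scaling log_{1/(1-p)}(1/(1-lam)), valid for 0 < p < 1."""
    return math.log(1.0 / (1.0 - lam)) / math.log(1.0 / (1.0 - p))


def decentralized_scaling(lam: float) -> float:
    """Leading-order scaling 1/(1-lam), the M/M/1-type growth for p = 0."""
    return 1.0 / (1.0 - lam)


if __name__ == "__main__":
    # Compare the two growth rates as the load approaches capacity.
    for lam in (0.9, 0.99, 0.999):
        print(lam, decentralized_scaling(lam), centralized_scaling(0.05, lam))
```

The second column grows linearly in $\frac{1}{1-\lambda}$ while the third grows only logarithmically, which is the sense in which the improvement is exponential.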



1.3 Related Work 

To the best of our knowledge, the proposed model for the splitting of processing
resources between distributed and central servers has not been studied before.
However, the fluid model approach used in this thesis is closely related to, and partially
motivated by, the so-called supermarket model of randomized load-balancing. In
that literature, it is shown that by routing tasks to the shorter queue among a small
number ($d \geq 2$) of randomly chosen queues, the probability that a typical queue has at
least i tasks (denoted by $\mathbf{s}_i$) decays as $\lambda^{\frac{d^i - 1}{d - 1}}$ (super-geometrically) as $i \to \infty$ ([3], [1]);
see also the survey paper [8] and references therein. However, the approach used in
load-balancing seems to offer little improvement when adapted to scheduling. In [5],
a variant of the randomized load-balancing policy was applied to a scheduling setting
with channel uncertainties, where the server always schedules a task from a longest
queue among a finite number of randomly selected queues. It was observed that $\mathbf{s}_i$
no longer exhibits super-geometric decay and only moderate performance gain can
be harnessed from sampling more than one queue.

In our setting, the system dynamics causing the exponential phase transition
in the average queue length scaling are significantly different from those for the
randomized load-balancing scenario. In particular, for any p > 0, the tail probabilities
$\mathbf{s}_i$ become zero for sufficiently large finite i, which is significantly faster than the
super-geometric decay in the supermarket model.

On the technical side, arrivals and processing times used in supermarket models 
are often memoryless (Poisson or Bernoulli) and the drifts in the fluid model are
typically continuous with respect to the underlying system state. Hence convergence 
results can be established by invoking classical approximation results, based on the 
convergence of the generators of the associated Markov processes. An exception is 
[7], where the authors generalized the supermarket model to arrival and processing 
times with general distributions. Since the queue length process is no longer Markov, 
the authors rely on an asymptotic independence property of the limiting system and 
use tools from statistical physics to establish convergence. 

Our system remains Markov with respect to the queue lengths, but a significant 
technical difference from the supermarket model lies in the fact that the longest- 
queue-first service policy introduces discontinuities in the drifts. For this reason, we 
need to use a more elaborate set of techniques to establish the connection between 
stochastic sample paths and the fluid model. Moreover, the presence of
discontinuities in the drifts creates challenges even for proving the uniqueness of solutions for
the deterministic fluid model. (Such uniqueness is needed to establish convergence
of steady-state distributions.) Our approach is based on a state representation that 
is different from the one used in the popular supermarket models, which turns out 
to be surprisingly more convenient to work with for establishing the uniqueness of 
solutions to the fluid model. 

Besides the queueing-theoretic literature, similar fluid model approaches have
been used in many other contexts to study systems with large populations.
Recent results in [6] establish convergence for finite-dimensional symmetric dynamical
systems with drift discontinuities, using a more probabilistic (as opposed to sample
path) analysis, carried out in terms of certain conditional expectations. We believe
that it is possible to prove our results using the methods in [6], with additional work.
However, the coupling approach used in this thesis provides strong physical intuition
on the system dynamics, and avoids the need for additional technicalities from the
theory of multi-valued differential inclusions.

Finally, there has been some work on the impact of service flexibilities in routing 
problems, motivated by applications such as multilingual call centers. These date 
back to the seminal work of [9], with a more recent numerical study in [10]. These
results show that the ability to route a portion of customers to a least-loaded station 
can lead to a constant-factor improvement in average delay under diffusion scaling. 
This line of work is very different from ours, but in a broader sense, both are trying 
to capture the notion that system performance in a random environment can benefit 
significantly from even a small amount of centralized coordination. 

1.4 Organization of the Thesis 

Chapter 2 introduces the precise model to be studied, our assumptions, and the
notation to be used throughout. The main results are summarized in Chapter 3,
where we also discuss their implications along with some numerical results. The
remainder of the thesis is devoted to establishing the technical results, and the reader
is referred to Section 4.1 for an overview of the proofs. The steps of two of the more
technical proofs are outlined in the main text, while the complete proofs are relegated
to Appendix A. The procedure and parameters used for numerical simulations are
described in Appendix B.
Chapter 2

Model and Notation 



This chapter covers the modeling assumptions, system state representations, and
mathematical notation that will be used throughout the thesis. Where possible, we
provide the intuition behind our modeling choices and assumptions. In
some cases, we will point the reader to explanations that will appear later in the
thesis, if the ideas involved are not immediately obvious at this stage.

2.1 Model 

We present our model using terminology that corresponds to the server farm
application in Section 1.1.1. Time is assumed to be continuous.

1. System. The system consists of N parallel stations. Each station is associated
with a queue which stores the tasks to be processed. The queue length (i.e.,
number of tasks) at station n at time t is denoted by $Q_n(t)$, $n \in \{1, 2, \ldots, N\}$, $t \geq 0$.
For now, we do not make any assumptions on the queue lengths at time
t = 0, other than that they are finite.

2. Arrivals. Stations receive streams of incoming tasks according to independent
Poisson processes with a common rate $\lambda \in [0, 1)$.

3. Task Processing. We fix a centralization coefficient $p \in [0, 1]$.

(a) Local Servers. The local server at station n is modeled by an
independent Poisson clock with rate 1 - p (i.e., the times between two clock
ticks are independent and exponentially distributed with mean $\frac{1}{1-p}$). If
the clock at station n ticks at time t, we say that a local service token
is generated at station n. If $Q_n(t) \neq 0$, exactly one task from station n
"consumes" the service token and leaves the system immediately.
Otherwise, the local service token is "wasted" and has no impact on the future
evolution of the system.

(b) Central Server. The central server is modeled by an independent
Poisson clock with rate Np. If the clock ticks at time t at the central server,
we say that a central service token is generated. If the system is
non-empty at t (i.e., $\sum_{n=1}^{N} Q_n(t) > 0$), exactly one task from some station n,
chosen uniformly at random out of the stations with a longest queue at
time t, consumes the service token and leaves the system immediately. If
the whole system is empty, the central service token is wasted.
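The three rules above translate directly into an event-driven simulation. The following Python sketch is a minimal illustration under stated assumptions: all queues start empty, the function name `simulate_queues` and its parameters are hypothetical, and it is not the simulation setup of Appendix B. It uses the fact that the superposition of the independent Poisson clocks is a Poisson process of rate $N\lambda + N(1-p) + Np$, with each event assigned a type in proportion to its rate.

```python
import random


def simulate_queues(N, lam, p, horizon, seed=0):
    """Simulate the N-station token model up to time `horizon`.

    Assumption for this sketch: every queue is empty at time 0.
    Returns the list of queue lengths at the end of the horizon.
    """
    rng = random.Random(seed)
    q = [0] * N
    arrival_rate = N * lam        # N independent arrival streams of rate lam
    local_rate = N * (1.0 - p)    # N local token clocks of rate 1 - p
    central_rate = N * p          # one central token clock of rate N * p
    total_rate = arrival_rate + local_rate + central_rate
    t = 0.0
    while True:
        # Time to the next event of the superposed Poisson process.
        t += rng.expovariate(total_rate)
        if t > horizon:
            return q
        u = rng.random() * total_rate
        if u < arrival_rate:
            # Arrival: by symmetry it lands at a uniformly chosen station.
            q[rng.randrange(N)] += 1
        elif u < arrival_rate + local_rate:
            # Local token at a uniformly chosen station; wasted if empty.
            n = rng.randrange(N)
            if q[n] > 0:
                q[n] -= 1
        else:
            # Central token: serve a longest queue, ties broken uniformly.
            longest = max(q)
            if longest > 0:
                candidates = [n for n, x in enumerate(q) if x == longest]
                q[rng.choice(candidates)] -= 1
```

A call such as `simulate_queues(100, 0.9, 0.05, 200.0)` returns one sample of the queue length vector; averaging `sum(q) / N` over several seeds gives a crude estimate of the average queue length.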

Physical interpretation of service tokens. We interpret $Q_n(t)$ as the number
of tasks whose service has not yet started. For example, if there are four tasks at
station n, one being served and three that are waiting, then $Q_n(t) = 3$. The use of
local service tokens can be thought of as an approximation to a work-conserving^1
server with exponential service time distribution in the following sense. Let $t_k$ be the
kth tick of the Poisson clock at the server associated with station n. If $Q_n(t_k-) > 0$,^2
the ticking of the clock can be thought of as the completion of a previous task, so
that the server "fetches" a new task from the queue to process, hence decreasing the
queue length by 1. Therefore, as long as the queue remains non-empty, the time
between two adjacent clock ticks can be interpreted as the service time for a task.
However, if the local queue is currently empty, i.e., $Q_n(t_k-) = 0$, then our modeling
assumption implies that the local server does nothing until the next clock tick at
$t_{k+1}$, even if some task arrives during the period $(t_k, t_{k+1})$. Alternatively, this can be
thought of as the server creating a "virtual task" whenever it sees an empty queue,
and pretending to be serving the virtual task until the next clock tick. In contrast,
a work-conserving server would start serving the next task immediately upon its
arrival. We have chosen to use the service token setup mainly because it simplifies
analysis, and it can also be justified in the following ways.

1. Because of the use of virtual tasks, one would expect the resulting queue length 
process under our setup to provide an upper bound on queue length process 
in the case of a work-conserving server. We do not formally prove such a 
dominance relation in this thesis, but note that a similar dominance result in 
GI/GI/n queues was proved recently (Proposition 1 of 



2. Since the discrepancy between the two setups only occurs when the server sees
an empty queue, one would also expect that the queue length processes under
the two cases become close as traffic intensity $\lambda \to 1$, in which case the queue
will be non-empty for most of the time.

The same physical interpretation applies to the central service tokens.

^1 A server is work-conserving if it is never idle when the queue is non-empty.

^2 Throughout the thesis, we use the short-hand notation $f(t-)$ to denote the left limit $\lim_{s \uparrow t} f(s)$.

Mathematical equivalence between the two motivating applications. We
note here that the scheduling application in Section 1.1.2 corresponds to the same
mathematical model. The arrival statistics to the stations are obviously identical in 
both models. For task processing, note that we can equally imagine all service tokens 
as being generated from a single Poisson clock with rate N. Upon the generation 
of a service token, a coin is flipped to decide whether the token will be directed 
to process a task at a random station (corresponding to a local service token), or 
a station with a longest queue (corresponding to a central service token). Due to 
the Poisson splitting property, this produces identical statistics for the generation of 
local and central service tokens as described above. 
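The rate bookkeeping behind this equivalence can be written out explicitly. The short derivation below is only a sketch of the splitting argument, assuming the coin selects the longest-queue option with probability p (as in Section 1.1.2) and that a random station is chosen uniformly among the N stations.

```latex
% Single token clock of rate N, split by an independent coin flip.
\begin{align*}
\text{central tokens:} \quad & N \cdot p = Np, \\
\text{tokens sent to a random station:} \quad & N \cdot (1 - p), \\
\text{tokens received by a given station } n: \quad
  & N \cdot (1 - p) \cdot \frac{1}{N} = 1 - p .
\end{align*}
% By the Poisson splitting property, the resulting N + 1 token streams are
% independent Poisson processes with rates Np and 1 - p (one per station),
% which are the rates of the central and local clocks in Section 2.1.
```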



2.2 System State 

Let us fix N. Since all events (arrivals of tasks and service tokens) are
generated according to independent Poisson processes, the queue length vector at time
t, $(Q_1(t), Q_2(t), \ldots, Q_N(t))$, is Markov. Moreover, the system is fully symmetric,
in the sense that all queues have identical and independent statistics for the arrivals
and local service tokens, and the assignment of central service tokens does not depend
on the specific identity of stations besides their queue lengths. Hence we can use a
Markov process $\{\mathbf{S}^N(t)\}_{t \geq 0}$ to describe the evolution of a system with N stations,
where

$$\mathbf{S}_i^N(t) = \frac{1}{N} \sum_{n=1}^{N} \mathbb{I}_{[i,\infty)}(Q_n(t)), \quad i \geq 0. \tag{2.1}$$

Each coordinate $\mathbf{S}_i^N(t)$ represents the fraction of queues with at least i tasks. Note
that $\mathbf{S}_0^N(t) = 1$ for all t and N according to this definition. We call $\mathbf{S}^N(t)$ the
normalized queue length process. We also define the aggregate queue length
process as

$$\mathbf{V}_i^N(t) \triangleq \sum_{j=i}^{\infty} \mathbf{S}_j^N(t), \quad i \geq 0. \tag{2.2}$$

Note that

$$\mathbf{S}_i^N(t) = \mathbf{V}_i^N(t) - \mathbf{V}_{i+1}^N(t). \tag{2.3}$$

In particular, this means that $\mathbf{V}_0^N(t) - \mathbf{V}_1^N(t) = \mathbf{S}_0^N(t) = 1$. Note also that

$$\mathbf{V}_1^N(t) = \sum_{j=1}^{\infty} \mathbf{S}_j^N(t) \tag{2.4}$$

is equal to the average queue length in the system at time t. When the total number
of tasks in the system is finite (hence all coordinates of $\mathbf{V}^N$ are finite), there is a
straightforward bijection between $\mathbf{S}^N$ and $\mathbf{V}^N$. Hence $\mathbf{V}^N(t)$ is Markov and also
serves as a valid representation of the system state. While the $\mathbf{S}^N$ representation
admits a more intuitive interpretation as the "tail" probability of a typical station
having at least i tasks, it turns out the $\mathbf{V}^N$ representation is significantly more
convenient to work with, especially in proving uniqueness of solutions to the associated
fluid model, and the detailed reasons will become clear in the sequel (see Section
6.2.1 for an extensive discussion on this topic). For this reason, we will be working
mostly with the $\mathbf{V}^N$ representation, but will in some places state results in terms of
$\mathbf{S}^N$, if doing so provides a better physical intuition.
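To make the two state representations concrete, the following Python sketch computes the tail fractions (fraction of queues with at least i tasks) and their partial sums from a snapshot of queue lengths, following the descriptions above. The function names are hypothetical, exact rational arithmetic is used purely for readability, and the input is assumed to be a non-empty list of non-negative integers.

```python
from fractions import Fraction


def tail_fractions(queues):
    """Return [S_0, ..., S_{m+1}], where S_i is the fraction of queues with
    at least i tasks and m is the largest queue length (so S_{m+1} = 0)."""
    n = len(queues)
    top = max(queues)
    return [Fraction(sum(1 for q in queues if q >= i), n) for i in range(top + 2)]


def aggregate(s):
    """Return [V_0, ..., V_{m+1}] with V_i = sum of S_j over j >= i,
    computed as a running sum from the tail of the finite list `s`."""
    v = []
    running = Fraction(0)
    for x in reversed(s):
        running += x
        v.append(running)
    return v[::-1]


if __name__ == "__main__":
    s = tail_fractions([0, 1, 3])
    v = aggregate(s)
    print(s)  # tail fractions S_0, S_1, ...
    print(v)  # aggregate values V_0, V_1, ...
```

For the snapshot `[0, 1, 3]`, the entry `v[1]` equals 4/3, which is the average queue length, and `v[0] - v[1]` equals 1, matching the remarks in this section.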

2.3 Notation 

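Let $\mathbb{Z}_+$ be the set of non-negative integers. The following sets will be used throughout
the thesis (where M is a positive integer):

$$\mathcal{S} = \left\{ \mathbf{s} \in [0, 1]^{\mathbb{Z}_+} : 1 = \mathbf{s}_0 \geq \mathbf{s}_1 \geq \cdots \geq 0 \right\}, \tag{2.5}$$

$$\mathcal{S}^M = \left\{ \mathbf{s} \in \mathcal{S} : \sum_{i=1}^{\infty} \mathbf{s}_i \leq M \right\}, \quad \mathcal{S}^\infty = \left\{ \mathbf{s} \in \mathcal{S} : \sum_{i=1}^{\infty} \mathbf{s}_i < \infty \right\}, \tag{2.6}$$

$$\mathcal{V}^M = \left\{ \mathbf{v} : \mathbf{v}_i = \sum_{j=i}^{\infty} \mathbf{s}_j, \text{ for some } \mathbf{s} \in \mathcal{S}^M \right\}, \tag{2.7}$$

$$\mathcal{V}^\infty = \left\{ \mathbf{v} : \mathbf{v}_i = \sum_{j=i}^{\infty} \mathbf{s}_j, \text{ for some } \mathbf{s} \in \mathcal{S}^\infty \right\}, \tag{2.8}$$

$$\mathcal{Q}^N = \left\{ \mathbf{x} \in \mathbb{R}^{\mathbb{Z}_+} : \mathbf{x}_i = \frac{K}{N}, \text{ for some } K \in \mathbb{Z}_+, \ \forall i \right\}. \tag{2.9}$$

We define the weighted $L_2$ norm $\| \cdot \|_w$ on $\mathbb{R}^{\mathbb{Z}_+}$ as

$$\| \mathbf{x} - \mathbf{y} \|_w^2 = \sum_{i=0}^{\infty} \frac{|\mathbf{x}_i - \mathbf{y}_i|^2}{2^i}, \quad \mathbf{x}, \mathbf{y} \in \mathbb{R}^{\mathbb{Z}_+}. \tag{2.10}$$

In general, we will be using bold letters to denote vectors and ordinary letters for
scalars, with the exception that a bold letter with a subscript (e.g., $\mathbf{v}_i$) is understood
as a (scalar-valued) component of a vector. Upper-case letters are generally reserved
for random variables (e.g., $\mathbf{V}^{(0,N)}$) or stochastic processes (e.g., $\mathbf{V}^N(t)$), and
lower-case letters are used for constants (e.g., $\mathbf{v}^0$) and deterministic functions (e.g., $\mathbf{v}(t)$).
Finally, a function is in general denoted by $x(\cdot)$, but is sometimes written as $x(t)$ to
emphasize the type of its argument.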
Chapter 3

Summary of Main Results 



In this chapter, we provide the exact statements of our main results. The main
approach of our work is to first derive key performance guarantees using a simpler
fluid model, and then apply probabilistic arguments (e.g., Functional Laws of Large
Numbers) to formally justify that such guarantees also carry over to sufficiently
large finite stochastic systems. Section 3.1 gives a formal definition of the core fluid
model used in this thesis, along with its physical interpretations. Section 3.2 contains
results that are derived by analyzing the dynamics of the fluid model, and Section
3.3 contains the more technical convergence theorems which justify the accuracy of
approximating a finite system using the fluid model approach. The proofs for the
theorems stated here will be developed in later chapters.

3.1 Definition of Fluid Model 

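Definition 1. (Fluid Model) Given an initial condition $\mathbf{v}^0 \in \mathcal{V}^\infty$, a function $\mathbf{v}(t) :
[0, \infty) \to \mathcal{V}^\infty$ is said to be a solution to the fluid model (or fluid solution for
short) if:

(1) $\mathbf{v}(0) = \mathbf{v}^0$;

(2) for all $t \geq 0$,

$$\mathbf{v}_0(t) - \mathbf{v}_1(t) = 1, \tag{3.1}$$

$$\text{and } 1 \geq \mathbf{v}_i(t) - \mathbf{v}_{i+1}(t) \geq \mathbf{v}_{i+1}(t) - \mathbf{v}_{i+2}(t) \geq 0, \quad \forall i \geq 0; \tag{3.2}$$

(3) for almost all $t \in [0, \infty)$, and for every $i \geq 1$, $\mathbf{v}_i(t)$ is differentiable and satisfies

$$\dot{\mathbf{v}}_i(t) = \lambda (\mathbf{v}_{i-1} - \mathbf{v}_i) - (1 - p)(\mathbf{v}_i - \mathbf{v}_{i+1}) - g_i(\mathbf{v}), \tag{3.3}$$

where

$$g_i(\mathbf{v}) = \begin{cases} p, & \mathbf{v}_i > 0, \\ \min\{\lambda \mathbf{v}_{i-1}, p\}, & \mathbf{v}_i = 0, \ \mathbf{v}_{i-1} > 0, \\ 0, & \mathbf{v}_i = 0, \ \mathbf{v}_{i-1} = 0. \end{cases} \tag{3.4}$$

We can write Eq. (3.3) more compactly as

$$\dot{\mathbf{v}}(t) = \mathbf{F}(\mathbf{v}), \tag{3.5}$$

where

$$\mathbf{F}_i(\mathbf{v}) = \lambda (\mathbf{v}_{i-1} - \mathbf{v}_i) - (1 - p)(\mathbf{v}_i - \mathbf{v}_{i+1}) - g_i(\mathbf{v}). \tag{3.6}$$

We call $\mathbf{F}(\mathbf{v})$ the drift at point $\mathbf{v}$.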

Interpretation of the fluid model. The solution to the fluid model, $\mathbf{v}(t)$,
can be thought of as a deterministic approximation to the sample paths of $\mathbf{V}^N(t)$
for large values of N. Conditions (1) and (2) correspond to initial and boundary
conditions, respectively. The boundary conditions reflect the physical constraints of
the finite system. In particular, the condition that $\mathbf{v}_0(t) - \mathbf{v}_1(t) = 1$ corresponds to
the fact that

$$\mathbf{V}_0^N(t) - \mathbf{V}_1^N(t) = \mathbf{S}_0^N(t) = 1, \tag{3.7}$$

where $\mathbf{S}_0^N(t)$ is the fraction of queues with a non-negative queue length, which is by
definition 1. Similarly, the condition that

$$1 \geq \mathbf{v}_i(t) - \mathbf{v}_{i+1}(t) \geq \mathbf{v}_{i+1}(t) - \mathbf{v}_{i+2}(t) \geq 0, \quad \forall i \geq 0, \tag{3.8}$$

is a consequence of

$$\left( \mathbf{V}_i^N(t) - \mathbf{V}_{i+1}^N(t) \right) - \left( \mathbf{V}_{i+1}^N(t) - \mathbf{V}_{i+2}^N(t) \right) = \mathbf{S}_i^N(t) - \mathbf{S}_{i+1}^N(t) \in [0, 1], \tag{3.9}$$

where $\mathbf{S}_i^N(t) - \mathbf{S}_{i+1}^N(t)$ is the fraction of queues at time t with exactly i tasks, which
is by definition between 0 and 1.

We now provide some intuition for each of the drift terms in Eq. (3.3):

I. $\lambda (\mathbf{v}_{i-1} - \mathbf{v}_i)$: This term corresponds to arrivals. When a task arrives at a
station with i - 1 tasks, the system has one more queue with i tasks, and $\mathbf{S}_i^N$ increases
by $\frac{1}{N}$. However, the number of queues with at least j tasks, for $j \neq i$, does not change.
Thus, $\mathbf{S}_i^N$ is the only one that is incremented. Since $\mathbf{V}_i^N = \sum_{k=i}^{\infty} \mathbf{S}_k^N$, this implies that
$\mathbf{V}_i^N$ is increased by $\frac{1}{N}$ if and only if a task arrives at a queue with at least i - 1
tasks. Since all stations have an identical arrival rate $\lambda$, the probability of $\mathbf{V}_i^N$ being
incremented upon an arrival to the system is equal to the fraction of queues with at
least i - 1 tasks, which is $\mathbf{V}_{i-1}^N(t) - \mathbf{V}_i^N(t)$. We take the limit as $N \to \infty$, multiply by
the total arrival rate, $N\lambda$, and then multiply by the increment due to each arrival,
$\frac{1}{N}$, to obtain the term $\lambda (\mathbf{v}_{i-1} - \mathbf{v}_i)$.

II. $(1 - p)(\mathbf{v}_i - \mathbf{v}_{i+1})$: This term corresponds to the completion of tasks due to
local service tokens. The argument is similar to that for the first term.

III. $g_i(\mathbf{v})$: This term corresponds to the completion of tasks due to central service
tokens.

1. $g_i(\mathbf{v}) = p$, if $v_i > 0$. If $i > 0$ and $v_i > 0$, then there is a positive fraction
of queues with at least $i$ tasks. Hence the central server is working at full
capacity, and the rate of decrease in $v_i$ due to central service tokens is equal
to the (normalized) maximum rate of the central server, namely $p$.

2. $g_i(\mathbf{v}) = \min\{\lambda v_{i-1}, p\}$, if $v_i = 0$, $v_{i-1} > 0$. This case is more subtle. Note that
since $v_i = 0$, the term $\lambda v_{i-1}$ is equal to $\lambda(v_{i-1} - v_i)$, which is the rate at which
$v_i$ increases due to arrivals. Here the central server serves queues with at least
$i$ tasks whenever such queues arise, to keep $v_i$ at zero. Thus, the total rate of
central service tokens dedicated to $v_i$ matches exactly the rate of increase of
$v_i$ due to arrivals.^1

3. $g_i(\mathbf{v}) = 0$, if $v_i = v_{i-1} = 0$. Here, both $v_i$ and $v_{i-1}$ are zero and there are no
queues with $i-1$ or more tasks. Hence there is no positive rate of increase in $v_i$
due to arrivals. Accordingly, the rate at which central service tokens are used
to serve stations with at least $i$ tasks is zero.

Note that, as mentioned in the introduction, the discontinuities in the fluid model
come from the term $g(\mathbf{v})$, which reflects the presence of a central server.
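The three drift terms can be collected into a single function. The following is a minimal sketch, not code from the thesis: the function name `drift`, the truncation to finitely many coordinates (everything beyond the last coordinate is treated as zero), and the list layout are all assumptions made for illustration.

```python
def drift(v, lam, p):
    """Return [dv_1/dt, ..., dv_K/dt] for a truncated state v = [v_1, ..., v_K].

    Assumptions of this sketch: coordinates beyond K are zero, and v_0 is
    recovered from the boundary condition v_0 - v_1 = 1.
    """
    K = len(v)
    ext = [v[0] + 1.0] + list(v) + [0.0]   # ext[i] = v_i for i = 0, ..., K + 1
    out = []
    for i in range(1, K + 1):
        arrivals = lam * (ext[i - 1] - ext[i])        # term I
        local = (1.0 - p) * (ext[i] - ext[i + 1])     # term II
        if ext[i] > 0:                                # term III, case 1
            central = p
        elif ext[i - 1] > 0:                          # term III, case 2
            central = min(lam * ext[i - 1], p)
        else:                                         # term III, case 3
            central = 0.0
        out.append(arrivals - local - central)
    return out
```

For instance, with $\lambda = 0.6$, $p = 0.3$ and the state $(v_1, v_2, v_3, v_4) = (0.5, 0.1, 0, 0)$, the third coordinate falls under case 2: arrivals push $v_3$ up at rate $\lambda v_2 = 0.06$ and the central term cancels exactly that amount, so $v_3$ stays at zero.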



^1 Technically, the minimization involving $p$ is not necessary: if $\lambda v_{i-1}(t) > p$, then $v_i(t)$
cannot stay at zero and will immediately increase after $t$. We keep the minimization just
to emphasize that the maximum rate of decrease in $v_i$ due to central service tokens cannot
exceed the central service capacity $p$.






3.2 Analysis of the Fluid Model and Exponential 
Phase Transition 

The following theorem characterizes the invariant state for the fluid model. It will
be used to demonstrate an exponential improvement in the rate of growth of the
average queue length as $\lambda \to 1$ (Corollary 3).

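Theorem 2. The drift $F(\cdot)$ in the fluid model admits a unique invariant state $\mathbf{v}^I$
(i.e., $F(\mathbf{v}^I) = 0$). Letting $s_i^I = v_i^I - v_{i+1}^I$ for all $i \geq 0$, the exact expression for the
invariant state is as follows:

(1) If $p = 0$, then $s_i^I = \lambda^i$, $\forall i \geq 1$.

(2) If $p \geq \lambda$, then $s_i^I = 0$, $\forall i \geq 1$.

(3) If $0 < p < \lambda$, and $\lambda = 1 - p$, then

$$s_i^I = \begin{cases} 1 - \left(\frac{p}{1-p}\right) i, & 1 \leq i \leq i^*(p,\lambda), \\ 0, & i > i^*(p,\lambda), \end{cases}$$

where $i^*(p,\lambda) = \left\lfloor \frac{1-p}{p} \right\rfloor$.

(4) If $0 < p < \lambda$, and $\lambda \neq 1 - p$, then

$$s_i^I = \begin{cases} \frac{1-\lambda}{1-(p+\lambda)} \left(\frac{\lambda}{1-p}\right)^i - \frac{p}{1-(p+\lambda)}, & 1 \leq i \leq i^*(p,\lambda), \\ 0, & i > i^*(p,\lambda), \end{cases}$$

where

$$i^*(p,\lambda) = \left\lfloor \log_{\frac{\lambda}{1-p}} \frac{p}{1-\lambda} \right\rfloor. \tag{3.10}$$

Proof. The proof consists of simple algebra to compute the solution to $F(\mathbf{v}^I) = 0$.
The proof is given in Section 6.1. □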
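The case analysis of Theorem 2 translates directly into a small routine. This is an illustrative sketch only: the names `i_star` and `invariant_tail` are hypothetical, the test for $\lambda = 1 - p$ uses a floating-point tolerance, and parameter values sitting exactly on a floor boundary may be affected by rounding.

```python
import math

def i_star(p, lam):
    """Threshold i*(p, lam) from cases (3) and (4); assumes 0 < p < lam < 1."""
    if math.isclose(lam, 1.0 - p):
        return math.floor((1.0 - p) / p)
    return math.floor(math.log(p / (1.0 - lam)) / math.log(lam / (1.0 - p)))

def invariant_tail(p, lam, i_max):
    """Return [s_1, ..., s_{i_max}] of the invariant state, for 0 <= p and 0 < lam < 1."""
    if p == 0:
        return [lam ** i for i in range(1, i_max + 1)]      # case (1)
    if p >= lam:
        return [0.0] * i_max                                # case (2)
    top = i_star(p, lam)
    tail = []
    for i in range(1, i_max + 1):
        if i > top:
            tail.append(0.0)
        elif math.isclose(lam, 1.0 - p):                    # case (3)
            tail.append(1.0 - p / (1.0 - p) * i)
        else:                                               # case (4)
            a = (1.0 - lam) / (1.0 - (p + lam))
            b = p / (1.0 - (p + lam))
            tail.append(a * (lam / (1.0 - p)) ** i - b)
    return tail
```

With $p = 0.05$ and $\lambda = 0.99$ the threshold evaluates to $i^* = 39$, and the first entry equals $(\lambda - p)/(1 - p)$, the fraction of busy stations implied by balancing the arrival rate against the total service rate.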

Case (4) in the above theorem is particularly interesting, as it reflects the system's
performance under heavy load ($\lambda$ close to 1). Note that since $s_i^I$ represents the
probability of a typical queue having at least $i$ tasks, the quantity

$$v_1^I = \sum_{i=1}^{\infty} s_i^I \tag{3.11}$$

represents the average queue length. The following corollary, which characterizes the
average queue length in the invariant state for the fluid model, follows from Case (4)
in Theorem 2 by some straightforward algebra.

^Here $\lfloor x \rfloor$ is defined as the largest integer that is less than or equal to $x$.

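Corollary 3. (Phase Transition in Average Queue Length Scaling) If $0 < p < \lambda$
and $\lambda \neq 1 - p$, then

$$v_1^I = \sum_{i=1}^{\infty} s_i^I = \frac{(1-p)(1-\lambda)}{(1-p-\lambda)^2} \left[ 1 - \left(\frac{\lambda}{1-p}\right)^{i^*(p,\lambda)} \right] - \frac{p}{1-p-\lambda}\, i^*(p,\lambda), \tag{3.12}$$

with $i^*(p,\lambda) = \left\lfloor \log_{\frac{\lambda}{1-p}} \frac{p}{1-\lambda} \right\rfloor$. In particular, this implies that for any fixed $p > 0$, $v_1^I$
scales as

$$v_1^I \sim i^*(p,\lambda) \sim \log_{\frac{1}{1-p}} \frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1. \tag{3.13}$$

The scaling of the average queue length in Eq. (3.13) with respect to arrival rate
$\lambda$ is contrasted with (and is exponentially better than) the familiar $\frac{1}{1-\lambda}$ scaling when
no centralized resource is available ($p = 0$).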
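The contrast between the two scalings can be checked numerically. The sketch below is an assumption-laden illustration rather than the thesis's own computation: it sums the tail probabilities through the balance recursion $\lambda s_{i-1} - (1-p) s_i - p = 0$ with $s_0 = 1$, which is what setting the drift to zero gives while $v_i > 0$, and stops at the first non-positive term. The function names are hypothetical.

```python
def mean_queue_length(p, lam, i_cap=10**6):
    """Sum the invariant tail probabilities s_1 + s_2 + ... for 0 < p < lam < 1.

    Uses the recursion s_i = (lam * s_{i-1} - p) / (1 - p) with s_0 = 1,
    truncated at the first index where the recursion is no longer positive.
    Returns (sum, number of positive terms).
    """
    total, s, count = 0.0, 1.0, 0
    for _ in range(i_cap):
        s = (lam * s - p) / (1.0 - p)
        if s <= 0.0:
            break
        total += s
        count += 1
    return total, count


def mm1_mean_queue_length(lam):
    """Sum of lam**i over i >= 1, i.e. the p = 0 case of Theorem 2."""
    return lam / (1.0 - lam)


if __name__ == "__main__":
    for lam in (0.9, 0.99, 0.999, 0.9999):
        total, count = mean_queue_length(0.05, lam)
        print(lam, count, round(total, 2), round(mm1_mean_queue_length(lam), 2))
```

Since every term lies in $(0, 1]$, the sum is bounded by the number of positive terms, which is the bound by $i^*(p,\lambda)$ used in the intuition for the logarithmic growth.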



Intuition for Exponential Phase Transition. The exponential improvement
in the scaling of $v_1^I$ is surprising, because the expressions for $s_i^I$ look ordinary and
do not contain any super-geometric terms in $i$. However, a closer look reveals that
for any $p > 0$, the tail probabilities $s_i^I$ have finite support: $s_i^I$ "dips" down to 0
as $i$ increases to $i^*(p,\lambda)$, which is even faster than a super-geometric decay. Since
$0 \leq s_i^I \leq 1$ for all $i$, it is then intuitive that $v_1^I = \sum_{i=1}^{\infty} s_i^I$ is upper-bounded by $i^*(p,\lambda)$,
which scales as $\log_{\frac{1}{1-p}} \frac{1}{1-\lambda}$ as $\lambda \to 1$. Note that a tail probability with "finite support"
implies that the fraction of stations with more than $i^*(p,\lambda)$ tasks decreases to zero as
$N \to \infty$. For example, we may have a strictly positive fraction of stations with, say,
10 tasks, but stations with more than 10 tasks hardly exist. While this may appear
counterintuitive, it is a direct consequence of centralization in the resource allocation
schemes. Since a fraction $p$ of the total resource is constantly going after the longest
queues, it is able to prevent long queues (i.e., queues with more than $i^*(p,\lambda)$ tasks)
from even appearing. The thresholds $i^*(p,\lambda)$ increasing to infinity as $\lambda \to 1$ reflects
the fact that the central server's ability to annihilate long queues is compromised by
the heavier traffic loads; our result essentially shows that the increase in $i^*(p,\lambda)$ is
surprisingly slow.

Figure 3-1: Values of $s_i^I$, as a function of $i$, for $p = 0$ and $p = 0.05$, with traffic intensity
$\lambda = 0.99$.



Numerical Results: Figure 3-1 compares the invariant state vectors for the
case $p = 0$ (stars) and $p = 0.05$ (diamonds). When $p = 0$, $s_i^I$ decays exponentially
as $\lambda^i$, while when $p = 0.05$, $s_i^I$ decays much faster, and reaches zero at around $i = 40$.
Figure 3-2 demonstrates the exponential phase transition in the average queue
length as the traffic intensity reaches 1, where the solid curve, corresponding to a
positive $p$, increases significantly slower than the usual $\frac{1}{1-\lambda}$ delay scaling (dotted
curve). Simulations show that the theoretical model offers good predictions for even
a moderate number of servers ($N = 100$). The detailed simulation setup can be found
in Appendix B. Table 3.1 gives examples of the values for $i^*(p,\lambda)$; note that these
values in some sense correspond to the maximum delay an average customer could
experience in the system.



Theorem 2 characterizes the invariant state of the fluid model, without saying if
and how a solution of the fluid model reaches it. The next two results state that given
any finite initial condition, the solution to the fluid model is unique and converges
to the unique invariant state as time goes to infinity.

Theorem 4. (Uniqueness of Solutions to Fluid Model) Given any initial condition
$\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty$, the fluid model has a unique solution $\mathbf{v}(\mathbf{v}^0,t)$, $t \in [0,\infty)$.

Proof. See Section 6.2. □

Figure 3-2: Illustration of the exponential improvement in average queue length from
$O\left(\frac{1}{1-\lambda}\right)$ to $O\left(\log \frac{1}{1-\lambda}\right)$ as $\lambda \to 1$, when we compare $p = 0$ to $p = 0.05$.

Table 3.1: Values of $i^*(p,\lambda)$ for various combinations of $(p, \lambda)$.



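Theorem 5. (Global Stability of Fluid Solutions) Given any initial condition
$\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty$, and with $\mathbf{v}(\mathbf{v}^0,t)$ the unique solution to the fluid model, we have

$$\lim_{t \to \infty} \left\| \mathbf{v}(\mathbf{v}^0,t) - \mathbf{v}^I \right\|_w = 0, \tag{3.14}$$

where $\mathbf{v}^I$ is the unique invariant state of the fluid model given in Theorem 2.

Proof. See Section 6.3. □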
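Global stability can be illustrated numerically by integrating the fluid model forward in time. The following is a rough sketch under several assumptions that are not part of the thesis: a forward-Euler scheme with a fixed step, truncation to finitely many coordinates, and clipping at zero to mimic the boundary behaviour of the discontinuous drift. The helper names are hypothetical.

```python
def drift(v, lam, p):
    """Fluid drift for a truncated state v = [v_1, ..., v_K] (coordinates beyond K are zero)."""
    K = len(v)
    ext = [v[0] + 1.0] + list(v) + [0.0]   # ext[i] = v_i, using v_0 = v_1 + 1
    out = []
    for i in range(1, K + 1):
        arrivals = lam * (ext[i - 1] - ext[i])
        local = (1.0 - p) * (ext[i] - ext[i + 1])
        if ext[i] > 0:
            central = p
        elif ext[i - 1] > 0:
            central = min(lam * ext[i - 1], p)
        else:
            central = 0.0
        out.append(arrivals - local - central)
    return out


def integrate(v0, lam, p, horizon, dt=0.01):
    """Forward-Euler approximation of the fluid trajectory, clipped at zero."""
    v = list(v0)
    for _ in range(int(round(horizon / dt))):
        d = drift(v, lam, p)
        v = [max(0.0, x + dt * dx) for x, dx in zip(v, d)]
    return v
```

Starting from an empty system with $\lambda = 0.6$ and $p = 0.3$, Theorem 2 gives $i^* = 1$ and $s_1^I = (\lambda - p)/(1 - p) = 3/7$, so the trajectory should settle at $v_1 = 3/7$ with all higher coordinates at zero.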
3.3 Convergence to a Fluid Solution - Finite Horizon and Steady State

The two theorems in this section justify the use of the fluid model as an approximation
for the finite stochastic system. The first theorem states that as $N \to \infty$ and
with high probability, the evolution of the aggregated queue length process $\mathbf{V}^N(t)$
is uniformly close, over any finite time horizon $[0,T]$, to the unique solution of the
fluid model.

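Theorem 6. (Convergence to Fluid Solutions over a Finite Horizon) Consider
a sequence of systems as the number of servers $N$ increases to infinity. Fix any
$T > 0$. If for some $\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty$,

$$\lim_{N \to \infty} \mathbb{P}\left( \left\| \mathbf{V}^N(0) - \mathbf{v}^0 \right\|_w > \gamma \right) = 0, \quad \forall \gamma > 0, \tag{3.15}$$

then

$$\lim_{N \to \infty} \mathbb{P}\left( \sup_{t \in [0,T]} \left\| \mathbf{V}^N(t) - \mathbf{v}(\mathbf{v}^0,t) \right\|_w > \gamma \right) = 0, \quad \forall \gamma > 0, \tag{3.16}$$

where $\mathbf{v}(\mathbf{v}^0,t)$ is the unique solution to the fluid model given initial condition $\mathbf{v}^0$.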

Proof. See Section 1^751 D 

Note that if we combine Theorem 6 with the convergence of $\mathbf{v}(t)$ to $\mathbf{v}^I$ in Theorem
5, we see that the finite system ($\mathbf{V}^N$) is approximated by the invariant state of the
fluid model $\mathbf{v}^I$ after a fixed time period. In other words, we now have

$$\lim_{t \to \infty} \lim_{N \to \infty} \mathbf{V}^N(t) = \mathbf{v}^I, \quad \text{in distribution.} \tag{3.17}$$

If we switch the order in which the limits over $t$ and $N$ are taken in Eq. (3.17), we are
then dealing with the limiting behavior of the sequence of steady-state distributions
(if they exist) as the system size grows large. Indeed, in practice it is often of great
interest to obtain a performance guarantee for the steady state of the system, if it
were to run for a long period of time. In light of Eq. (3.17), we may expect that

$$\lim_{N \to \infty} \lim_{t \to \infty} \mathbf{V}^N(t) = \mathbf{v}^I, \quad \text{in distribution.} \tag{3.18}$$

The following theorem shows that this is indeed the case, i.e., that a unique steady-state
distribution of $\mathbf{V}^N(t)$ (denoted by $\pi^N$) exists for all $N$, and that the sequence
$\pi^N$ concentrates on the invariant state of the fluid model ($\mathbf{v}^I$) as $N$ grows large.

Theorem 7. (Convergence of Steady-state Distributions to $\mathbf{v}^I$) Denote by
$\mathcal{F}_{\overline{\mathcal{V}}^\infty}$ the $\sigma$-algebra generated by $\overline{\mathcal{V}}^\infty$. For any $N$, the process $\mathbf{V}^N(t)$ is positive
recurrent, and it admits a unique steady-state distribution $\pi^N$. Moreover,

$$\lim_{N \to \infty} \pi^N = \delta_{\mathbf{v}^I}, \quad \text{in distribution,} \tag{3.19}$$

where $\delta_{\mathbf{v}^I}$ is a probability measure on $\overline{\mathcal{V}}^\infty$ that is concentrated on $\mathbf{v}^I$, i.e., for all
$X \in \mathcal{F}_{\overline{\mathcal{V}}^\infty}$,

$$\delta_{\mathbf{v}^I}(X) = \begin{cases} 1, & \mathbf{v}^I \in X, \\ 0, & \text{otherwise.} \end{cases}$$

Proof. The proof is based on the tightness of the sequence of steady-state distributions
$\pi^N$, and a uniform rate of convergence of $\mathbf{V}^N(t)$ to $\mathbf{v}(t)$ over any compact set
of initial conditions. The proof is given in Chapter 7. □

Figure 3-3: Relationships between convergence results.

Figure 3-3 summarizes the relationships between the convergence to the solution
of the fluid model over a finite time horizon (Theorem 5 and Theorem 6) and the
convergence of the sequence of steady-state distributions (Theorem 7).
Chapter 4

Probability Space and Coupling 



Starting from this chapter, the remainder of the thesis will be devoted to proving the
results summarized in Chapter 3. We begin by giving an outline of the main proof
techniques, as well as the relationships among them, in Section 4.1. The remainder
of the current chapter focuses on constructing probability spaces and appropriate
couplings of stochastic sample paths, which will serve as the foundation for later
analysis.

4.1 Overview of Technical Approach 

We begin by coupling the sample paths of processes of interest (e.g., $\mathbf{V}^N(\cdot)$) with
those of two fundamental processes that drive the system dynamics (Section 4.2).
This approach allows us to link deterministically the convergence properties of the
sample paths of interest to those of the fundamental processes, on which probabilistic
arguments are easier to apply (such as the Functional Law of Large Numbers). Using
this coupling framework, we show in Chapter 5 that almost all sample paths of $\mathbf{V}^N(\cdot)$
are "tight" in the sense that, as $N \to \infty$, they are uniformly approximated by a set
of Lipschitz-continuous trajectories, which we refer to as the fluid limits, and that
all such fluid limits are valid solutions to the fluid model. This result connects the
finite stochastic system with the deterministic fluid solutions. Chapter 6 studies the
properties of the fluid model, and provides proofs for Theorems 4 and 5. Note that
Theorem 6 (convergence of $\mathbf{V}^N(\cdot)$ to the unique fluid solution, over a finite time
horizon) now follows from the tightness results in Chapter 5 and the uniqueness
of fluid solutions (Theorem 4). The proof of Theorem 2 stands alone, and will be
given in Section 6.1. Finally, the proof of Theorem 7 (convergence of steady-state
distributions to $\mathbf{v}^I$) is given in Chapter 7.

The goal of the current chapter is to formally define the probability spaces and
stochastic processes with which we will be working in the rest of the thesis. Specifically,
we begin by introducing two fundamental processes, from which all other
processes of interest (e.g., $\mathbf{V}^N(\cdot)$) can be derived on a per sample path basis.

4.2 Definition of Probability Space 

Definition 8. (Fundamental Processes and Initial Conditions)

(1) The Total Event Process, $\{W(t)\}_{t \geq 0}$, defined on a probability space $(\Omega_W, \mathcal{F}_W, \mathbb{P}_W)$,
is a Poisson process with rate $\lambda + 1$, where each jump marks the time when an
"event" takes place in the system.

(2) The Selection Process, $\{U(n)\}_{n \in \mathbb{Z}_+}$, defined on a probability space $(\Omega_U, \mathcal{F}_U, \mathbb{P}_U)$,
is a discrete-time process, where each $U(n)$ is independent and uniformly distributed
in $[0,1]$. This process, along with the current system state, determines
the type of each event (i.e., whether it is an arrival, a local token generation, or
a central token generation).

(3) The (Finite) Initial Conditions, $\{\mathbf{V}^{(0,N)}\}_{N \in \mathbb{N}}$, is a sequence of random variables
defined on a common probability space $(\Omega_0, \mathcal{F}_0, \mathbb{P}_0)$, with $\mathbf{V}^{(0,N)}$ taking
values^ in $\overline{\mathcal{V}}^\infty \cap \mathcal{Q}^N$. Here, $\mathbf{V}^{(0,N)}$ represents the initial queue length distribution.

For the rest of the thesis, we will be working with the product space

$$(\Omega, \mathcal{F}, \mathbb{P}) = (\Omega_W \times \Omega_U \times \Omega_0, \; \mathcal{F}_W \times \mathcal{F}_U \times \mathcal{F}_0, \; \mathbb{P}_W \times \mathbb{P}_U \times \mathbb{P}_0). \tag{4.1}$$

With a slight abuse of notation, we use the same symbols $W(t)$, $U(n)$ and $\mathbf{V}^{(0,N)}$
for their corresponding extensions on $\Omega$, i.e. $W(\omega,t) = W(\omega_W,t)$, where $\omega \in \Omega$ and
$\omega = (\omega_W, \omega_U, \omega_0)$. The same holds for $U$ and $\mathbf{V}^{(0,N)}$.

4.3 A Coupled Construction of Sample Paths 

Recall the interpretation of the fluid model drift terms in Section 3.1. Mimicking
the expression of $\dot{v}_i(t)$ in Eq. (3.3), we would like to decompose $V_i^N(t)$ into three
non-decreasing right-continuous processes,

$$V_i^N(t) = V_i^N(0) + A_i^N(t) - L_i^N(t) - C_i^N(t), \quad i \geq 1, \tag{4.2}$$

so that $A_i^N(t)$, $L_i^N(t)$, and $C_i^N(t)$ correspond to the cumulative changes in $V_i^N$ due
to arrivals, local service tokens, and central service tokens, respectively. We will
define processes $\mathbf{A}^N(t)$, $\mathbf{L}^N(t)$, $\mathbf{C}^N(t)$, and $\mathbf{V}^N(t)$ on the common probability space
$(\Omega, \mathcal{F}, \mathbb{P})$, and couple them with the sample paths of the fundamental processes $W(t)$
and $U(n)$, and the value of $\mathbf{V}^{(0,N)}$, for each sample $\omega \in \Omega$. First, note that since
the $N$-station system has $N$ independent Poisson arrival streams, each with rate $\lambda$,
and an exponential server with rate $N$, the total event process for this system is a
Poisson process with rate $N(1 + \lambda)$. Hence, we define $W^N(\omega,t)$, the $N$th normalized
event process, as

$$W^N(\omega,t) = \frac{1}{N} W(\omega, Nt), \quad \forall t \geq 0, \; \omega \in \Omega. \tag{4.3}$$

Note that $W^N(\omega,t)$ is normalized so that all of its jumps have a magnitude of $\frac{1}{N}$.

^For a finite system of $N$ stations, the measure induced by $V_i^N(t)$ is discrete and takes positive
values only in the set of rational numbers with denominator $N$.



[Figure: the interval [0, 1] split into three consecutive regions marked (1), (2), and (3).]

Figure 4-1: Illustration of the partition of $[0, 1]$ for constructing $\mathbf{V}^N(\omega, \cdot)$.

The coupled construction is intuitive: whenever there is a jump in $W^N(\omega,\cdot)$,
we decide the type of event by looking at the value of the corresponding selection
variable $U(\omega,n)$ and the current state of the system $\mathbf{V}^N(\omega,t)$. Fix $\omega$ in $\Omega$, and let
$t_k$, $k \geq 1$, denote the time of the $k$th jump in $W^N(\omega, \cdot)$.

We first set all of $\mathbf{A}^N$, $\mathbf{L}^N$, and $\mathbf{C}^N$ to zero for $t \in [0,t_1)$. Starting from $k = 1$,
repeat the following steps for increasing values of $k$. The partition of the interval
$[0,1]$ used in the procedure is illustrated in Figure 4-1.

(1) If $U(\omega,k) \in \frac{\lambda}{1+\lambda}\left[0, V_{i-1}^N(\omega,t_k-) - V_i^N(\omega,t_k-)\right)$ for some $i \geq 1$, the event
corresponds to an arrival to a station with at least $i-1$ tasks. Hence we increase
$A_i^N(\omega,t)$ by $\frac{1}{N}$ at all such $i$.

(2) If $U(\omega,k) \in \frac{\lambda}{1+\lambda} + \frac{1-p}{1+\lambda}\left[0, V_i^N(\omega,t_k-) - V_{i+1}^N(\omega,t_k-)\right)$ for some $i \geq 1$, the event
corresponds to the completion of a task at a station with at least $i$ tasks due to
a local service token. We increase $L_i^N(\omega,t)$ by $\frac{1}{N}$ at all such $i$. Note that $i = 0$
is not included here, reflecting the fact that if a local service token is generated
at an empty station, it is immediately wasted and has no impact on the system.

(3) Finally, if $U(\omega,k) \in \frac{\lambda}{1+\lambda} + \frac{1-p}{1+\lambda} + \left[0, \frac{p}{1+\lambda}\right) = \left[1 - \frac{p}{1+\lambda}, 1\right)$, the event corresponds to
the generation of a central service token. Since the central service token
is always sent to a station with the longest queue length, we will have a task
completion in a most-loaded station, unless the system is empty. Let $i^*(t)$ be
the last positive coordinate of $\mathbf{V}^N(\omega,t-)$, i.e., $i^*(t) = \sup\{i : V_i^N(\omega,t-) > 0\}$. We
increase $C_j^N(\omega,t)$ by $\frac{1}{N}$ for all $j$ such that $1 \leq j \leq i^*(t_k)$.

To finish, we set $\mathbf{V}^N(\omega,t)$ according to Eq. (4.2), and keep the values of all
processes unchanged between $t_k$ and $t_{k+1}$. We set $V_0^N = V_1^N + 1$, so as to stay
consistent with the definition of $V_0^N$.
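The construction above can be mimicked in a short simulation. This sketch is an illustration under assumptions, not the setup of Appendix B: the function name `simulate` is hypothetical, the system starts empty, and instead of tracking $\mathbf{A}^N$, $\mathbf{L}^N$, $\mathbf{C}^N$ separately it stores `counts[j]`, the number of stations with at least `j` tasks, so that `counts[j] / N` corresponds to $V_j^N - V_{j+1}^N$.

```python
import random

def simulate(n_stations, lam, p, horizon, seed=0):
    """Simulate the N-station system, drawing one uniform selection variable per event.

    Returns (counts, arrivals, completions), where counts[j] is the number of
    stations holding at least j tasks at the end of the horizon.
    """
    rng = random.Random(seed)
    N = n_stations
    counts = [N]                 # counts[0] = N: every station has at least 0 tasks
    arrivals = completions = 0
    t = 0.0
    while True:
        t += rng.expovariate(N * (1.0 + lam))   # next jump of the total event process
        if t > horizon:
            break
        u = rng.random()
        if u < lam / (1.0 + lam):
            # (1) arrival: rescale u to [0, 1) and locate the target queue length m
            x = u * (1.0 + lam) / lam
            m = max((j for j, c in enumerate(counts) if c > x * N), default=0)
            if m + 1 == len(counts):
                counts.append(0)
            counts[m + 1] += 1
            arrivals += 1
        elif u < (lam + 1.0 - p) / (1.0 + lam):
            # (2) local token: wasted if it lands on an empty station
            x = (u - lam / (1.0 + lam)) * (1.0 + lam) / (1.0 - p)
            busy = [j for j, c in enumerate(counts) if j >= 1 and c > x * N]
            if busy:
                counts[max(busy)] -= 1
                completions += 1
        else:
            # (3) central token: serve a longest queue, if any task is present
            longest = max(j for j, c in enumerate(counts) if c > 0)
            if longest >= 1:
                counts[longest] -= 1
                completions += 1
    return counts, arrivals, completions
```

Because `counts` is non-increasing in `j`, the indices with `counts[j] > x * N` always form a prefix, which is the code's version of "for all such $i$" in steps (1) and (2). The total number of tasks in the system equals `sum(counts[1:])`, and `counts[1] / N` is the fraction of busy stations.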
Chapter 5

Fluid Limits of Stochastic Sample 
Paths 



In this chapter, we formally establish the connections between the stochastic sample
paths ($\mathbf{V}^N(\cdot)$) and the solutions to the fluid model ($\mathbf{v}(\cdot)$). Through two important
technical results (Propositions 11 and 12), it is shown that, as $N \to \infty$ and almost
surely, any subsequence of $\{\mathbf{V}^N(\cdot)\}_{N \geq 1}$ contains a further subsequence that converges
uniformly to a solution of the fluid model, over any finite horizon $[0,T]$. This
provides strong justification for using the fluid model as an approximation for the
stochastic system over a finite time period. However, we note that the results presented
in this chapter do not imply the converse, that any solution to the fluid model
corresponds to a limit point of some sequence of stochastic sample paths. This
issue will be resolved in the next chapter where we show the important property of
the uniqueness of fluid solutions. The results presented in this chapter, together with
the uniqueness of fluid solutions, will then have established that the fluid model fully
characterizes the transient behavior of sufficiently large finite stochastic systems over
a finite time horizon $[0,T]$.

In the sample-path-wise construction in Section 4.3, all randomness is attributed
to the initial condition $\mathbf{V}^{(0,N)}$ and the two fundamental processes $W(\cdot)$ and $U(\cdot)$.
Everything else, including the system state $\mathbf{V}^N(\cdot)$ that we are interested in, can be
derived from a deterministic mapping, given a particular realization of $\mathbf{V}^{(0,N)}$, $W(\cdot)$,
and $U(\cdot)$. With this in mind, the approach we will take to prove convergence to a
fluid limit (i.e., a limit point of $\{\mathbf{V}^N(\cdot)\}_{N \geq 1}$), over a finite time interval $[0,T]$, can
be summarized as follows.

(1) (Lemma 9) Define a subset $\mathcal{C}$ of the sample space $\Omega$, such that $\mathbb{P}(\mathcal{C}) = 1$ and the
sample paths of $W$ and $U$ are sufficiently "nice" for every $\omega \in \mathcal{C}$.

(2) (Proposition 11) Show that for all $\omega$ in this nice set, the derived sample paths
$\mathbf{V}^N(\cdot)$ are also "nice", and contain a subsequence converging to a Lipschitz-continuous
trajectory $\mathbf{v}(\cdot)$, as $N \to \infty$.

(3) (Proposition 12) Characterize the derivative at any regular point^ of $\mathbf{v}(\cdot)$ and
show that it is identical to the drift in the fluid model. Hence $\mathbf{v}(\cdot)$ is a solution
to the fluid model.

The proofs will be presented according to the above order.

5.1 Tightness of Sample Paths over a Nice Set 

We begin by proving the following lemma which characterizes a "nice" set $\mathcal{C} \subset \Omega$
whose elements have desirable convergence properties.

Lemma 9. Fix $T > 0$. There exists a measurable set $\mathcal{C} \subset \Omega$ such that $\mathbb{P}(\mathcal{C}) = 1$ and
for all $\omega \in \mathcal{C}$,

$$\lim_{N \to \infty} \sup_{t \in [0,T]} \left| W^N(\omega,t) - (1 + \lambda)t \right| = 0, \tag{5.1}$$

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \mathbb{I}_{[a,b)}(U(\omega,i)) = b - a, \quad \text{if } a < b \text{ and } [a,b) \subset [0,1]. \tag{5.2}$$

Proof. Based on the Functional Law of Large Numbers for Poisson processes, we can
find $\mathcal{C}_W \subset \Omega_W$, with $\mathbb{P}_W(\mathcal{C}_W) = 1$, over which Eq. (5.1) holds. For Eq. (5.2), we
invoke the Glivenko-Cantelli theorem, which states that the empirical measures of a
sequence of i.i.d. random variables converge uniformly almost surely, i.e.,

$$\lim_{N \to \infty} \sup_{x \in [0,1]} \left| \frac{1}{N} \sum_{i=1}^{N} \mathbb{I}_{[0,x)}(U(i)) - x \right| = 0, \quad \text{almost surely.} \tag{5.3}$$

This implies the existence of some $\mathcal{C}_U \subset \Omega_U$, with $\mathbb{P}_U(\mathcal{C}_U) = 1$, over which Eq. (5.2)
holds. (This is stronger than the ordinary Strong Law of Large Numbers for i.i.d.
uniform random variables on $[0,1]$, which states convergence for a fixed set $[0,x)$.)
We finish the proof by taking $\mathcal{C} = \mathcal{C}_W \times \mathcal{C}_U \times \Omega_0$. □
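Both limits in Lemma 9 can be observed on simulated sample paths. The sketch below is only an illustration: the function names are hypothetical, one pseudo-random path stands in for a sample $\omega$, and a finite $N$ can suggest but not prove the almost-sure statements.

```python
import random

def poisson_sup_deviation(N, lam, T, rng):
    """sup over [0, T] of |W^N(t) - (1 + lam) t| on one path, with W^N(t) = W(N t) / N."""
    rate = N * (1.0 + lam)          # W(N t) jumps at rate N (1 + lam) in the time scale of t
    t, jumps, worst = 0.0, 0, 0.0
    while True:
        t_next = t + rng.expovariate(rate)
        if t_next > T:
            # no further jump before T: check the deviation at the right end point
            return max(worst, abs(jumps / N - (1.0 + lam) * T))
        # the deviation is extremal just before and right at each jump
        worst = max(worst,
                    abs(jumps / N - (1.0 + lam) * t_next),
                    abs((jumps + 1) / N - (1.0 + lam) * t_next))
        t, jumps = t_next, jumps + 1


def empirical_sup_deviation(n, rng):
    """sup over x in [0, 1] of |(1/n) #{i <= n : U(i) < x} - x| for n uniform draws."""
    u = sorted(rng.random() for _ in range(n))
    # at the k-th order statistic the empirical fraction jumps from k/n to (k+1)/n
    return max(max(abs(k / n - x), abs((k + 1) / n - x)) for k, x in enumerate(u))
```

Calling both functions with increasing `N` (or `n`) and a fixed `random.Random` seed should show the deviations shrinking, roughly at the $1/\sqrt{N}$ rate one would expect from central-limit scaling.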



^Regular points are points where the derivative exists along all coordinates of the trajectory. Since
the trajectory is Lipschitz-continuous along every coordinate, almost all points are regular.

Definition 10. We call the 4-tuple, $\mathbf{X}^N = (\mathbf{V}^N, \mathbf{A}^N, \mathbf{L}^N, \mathbf{C}^N)$, the $N$th system.

Note that all four components are infinite-dimensional processes.^

Consider the space of functions from $[0,T]$ to $\mathbb{R}$ that are right-continuous-with-left-limits
(RCLL), denoted by $D[0,T]$, and let it be equipped with the uniform
metric, $d(\cdot,\cdot)$:

$$d(x,y) = \sup_{t \in [0,T]} |x(t) - y(t)|, \quad x,y \in D[0,T]. \tag{5.4}$$

Denote by $D^\infty[0,T]$ the set of functions from $[0,T]$ to $\mathbb{R}^{\mathbb{Z}_+}$ that are RCLL on every
coordinate. Let $d^{\mathbb{Z}_+}(\cdot,\cdot)$ denote the uniform metric on $D^\infty[0,T]$:

$$d^{\mathbb{Z}_+}(\mathbf{x},\mathbf{y}) = \sup_{t \in [0,T]} \left\| \mathbf{x}(t) - \mathbf{y}(t) \right\|_w, \quad \mathbf{x},\mathbf{y} \in D^\infty[0,T], \tag{5.5}$$

with $\| \cdot \|_w$ defined in Eq. (2.10).
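For paths sampled on a common time grid, the metric in Eq. (5.5) can be approximated as follows. This is a sketch under a stated assumption: Eq. (2.10) is not reproduced in this chapter, and the code takes the weighted norm to be $\|\mathbf{x}\|_w = \sqrt{\sum_{i \geq 0} |x_i|^2 / 2^i}$, which is the form the expansion in Eq. (5.10) appears to use. The function names are hypothetical.

```python
import math

def weighted_norm(x):
    """sqrt(sum_i x_i^2 / 2^i) for a finite list x = [x_0, x_1, ...] (assumed form of Eq. (2.10))."""
    return math.sqrt(sum(xi * xi / 2.0 ** i for i, xi in enumerate(x)))


def uniform_distance(path_x, path_y):
    """Largest weighted-norm gap between two paths sampled on the same time grid."""
    return max(weighted_norm([a - b for a, b in zip(xt, yt)])
               for xt, yt in zip(path_x, path_y))
```

Each path is a list of state vectors, one per grid point. The geometric weights make discrepancies in high coordinates count less, which is what lets a coordinate-by-coordinate limit be lifted to convergence in $d^{\mathbb{Z}_+}$.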

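The following proposition is the main result of this section. It shows that for sufficiently
large $N$, the sample paths are sufficiently close to some absolutely continuous
trajectory.

Proposition 11. Fix $T > 0$. Assume that there exists some $\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty$ such that

$$\lim_{N \to \infty} \left\| \mathbf{V}^N(\omega,0) - \mathbf{v}^0 \right\|_w = 0, \tag{5.6}$$

for all $\omega \in \mathcal{C}$. Then for all $\omega \in \mathcal{C}$, any subsequence of $\{\mathbf{X}^N(\omega,\cdot)\}$ contains a further
subsequence, $\{\mathbf{X}^{N_i}(\omega,\cdot)\}$, that converges to some coordinate-wise Lipschitz-continuous
function $\mathbf{x}(t) = (\mathbf{v}(t), \mathbf{a}(t), \mathbf{l}(t), \mathbf{c}(t))$, with $\mathbf{v}(0) = \mathbf{v}^0$, $\mathbf{a}(0) = \mathbf{l}(0) =
\mathbf{c}(0) = 0$ and

$$|x_i(a) - x_i(b)| \leq L|a - b|, \quad \forall a,b \in [0,T], \; i \in \mathbb{Z}_+, \tag{5.7}$$

where $L > 0$ is a universal constant, independent of the choice of $\omega$, $\mathbf{x}$ and $T$. Here
the convergence refers to $d^{\mathbb{Z}_+}(\mathbf{V}^{N_i},\mathbf{v})$, $d^{\mathbb{Z}_+}(\mathbf{A}^{N_i},\mathbf{a})$, $d^{\mathbb{Z}_+}(\mathbf{L}^{N_i},\mathbf{l})$, and $d^{\mathbb{Z}_+}(\mathbf{C}^{N_i},\mathbf{c})$ all
converging to 0, as $i \to \infty$.

For the rest of the thesis, we will refer to such a limit point $\mathbf{x}$, or any subset of
its coordinates, as a fluid limit.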

Proof outline: Here we lay out the main steps of the proof; interested readers
are referred to Appendix A.1 for a complete proof.

We first show that for all $\omega \in \mathcal{C}$, and for every coordinate $i$, any subsequence
of $\{X_i^N(\omega,\cdot)\}$ has a convergent subsequence with a Lipschitz-continuous limit. We
then use the coordinate-wise limit to construct a limit point in the space $D^{\mathbb{Z}_+}$. To
establish coordinate-wise convergence, we use a tightness technique previously used
in the literature of multiclass queueing networks (see, e.g., [1]). A key realization in
this case is that the total number of jumps in any derived process $\mathbf{A}^N$, $\mathbf{L}^N$, and $\mathbf{C}^N$
cannot exceed that of the event process $W^N(t)$ for any particular sample path. Since
$\mathbf{A}^N$, $\mathbf{L}^N$, and $\mathbf{C}^N$ are non-decreasing, we expect their sample paths to be "smooth"
for large $N$, due to the fact that the sample path of $W^N(t)$ does become "smooth"
for large $N$, for all $\omega \in \mathcal{C}$ (Lemma 9). More formally, it can be shown that for
all $\omega \in \mathcal{C}$ and $T > 0$, there exist diminishing positive sequences $M_N \downarrow 0$ and $\gamma_N \downarrow 0$,
such that the sample path along any coordinate of $\mathbf{X}^N$ is $\gamma_N$-approximately-Lipschitz
continuous with a uniformly bounded initial condition, i.e., for all $i$,

$$\left| X_i^N(\omega,0) - x_i^0 \right| \leq M_N, \quad \text{and} \quad \left| X_i^N(\omega,a) - X_i^N(\omega,b) \right| \leq L|a - b| + \gamma_N, \quad \forall a,b \in [0,T],$$

where $L$ is the Lipschitz constant, and $T < \infty$ is a fixed time horizon. Using a
linear interpolation argument, we then show that sample paths of the above form
can be uniformly approximated by a set of $L$-Lipschitz-continuous functions on $[0,T]$.
We finish by using the Arzelà-Ascoli theorem (sequential compactness) along with
closedness of this set, to establish the existence of a convergent further subsequence
along any subsequence (compactness) and that any limit point must also be $L$-Lipschitz-continuous
(closedness). This completes the proof for coordinate-wise convergence.

^If necessary, $\mathbf{X}^N$ can be enumerated by writing it explicitly as $\mathbf{X}^N =
(V_0^N, A_0^N, L_0^N, C_0^N, V_1^N, A_1^N, \ldots)$.
With the coordinate-wise limit points, we then use a diagonal argument to construct
the limit points of $\mathbf{X}^N$ in the space $D^{\mathbb{Z}_+}[0,T]$. Let $v_1$ be any $L$-Lipschitz-continuous
limit point of $V_1^N$, so that a subsequence $V_1^{N_j^1}(\omega,\cdot) \to v_1$, as $j \to \infty$, with
respect to $d(\cdot,\cdot)$. Then, we proceed recursively by letting $v_{i+1}$ be a limit point of a
subsequence of $\left\{ V_{i+1}^{N_j^i}(\omega,\cdot) \right\}_{j=1}^{\infty}$, where $\{N_j^i\}_{j=1}^{\infty}$ are the indices for the $i$th subsequence.

We claim that $\mathbf{v}$ is indeed a limit point of $\mathbf{V}^N$ under the norm $d^{\mathbb{Z}_+}(\cdot,\cdot)$. Note that
since $v_1(0) = v_1^0$, $0 \leq V_i^N(t) \leq V_1^N(t)$, and $v_1(\cdot)$ is $L$-Lipschitz-continuous, we have
that

$$\sup_{t \in [0,T]} |v_i(t)| \leq \sup_{t \in [0,T]} |v_1(t)| \leq |v_1^0| + LT, \quad \forall i \in \mathbb{Z}_+. \tag{5.8}$$

Set $N_1 = 1$, and let, for $k \geq 2$,

$$N_k = \min\left\{ N \geq N_{k-1} : \sup_{1 \leq i \leq k} d\left(V_i^N(\omega,\cdot), v_i\right) \leq \frac{1}{k} \right\}. \tag{5.9}$$

Note that the construction of $\mathbf{v}$ implies that $N_k$ is well defined and finite for all $k$.
From Eqs. (5.8) and (5.9), we have, for all $k \geq 2$,



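$$d^{\mathbb{Z}_+}\left(\mathbf{V}^{N_k}(\omega,\cdot), \mathbf{v}\right) = \sup_{t \in [0,T]} \sqrt{\sum_{i=0}^{\infty} \frac{\left| V_i^{N_k}(\omega,t) - v_i(t) \right|^2}{2^i}} \leq \frac{1}{k} + \sqrt{\sum_{i=k+1}^{\infty} \frac{1}{2^i}} \left( |v_1^0| + LT \right). \tag{5.10}$$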

Hence $d^{\mathbb{Z}_+}\left(\mathbf{V}^{N_k}(\omega,\cdot), \mathbf{v}\right) \to 0$, as $k \to \infty$. The existence of the limit points $\mathbf{a}(t)$, $\mathbf{l}(t)$
and $\mathbf{c}(t)$ can be established by an identical argument. This completes the proof.

5.2 Derivatives of the Fluid Limits 

The previous section established that any sequence of "good" sample paths ($\{\mathbf{X}^N(\omega,\cdot)\}$
with $\omega \in \mathcal{C}$) eventually stays close to some Lipschitz-continuous, and therefore
absolutely continuous, trajectory. In this section, we will characterize the derivatives of
$\mathbf{v}(\cdot)$ at all regular (differentiable) points of such limiting trajectories. We will show,
as we expect, that they are the same as the drift terms in the fluid model (Definition
1). This means that all fluid limits of $\mathbf{V}^N(\cdot)$ are in fact solutions to the fluid model.

Proposition 12. (Fluid Limits and Fluid Model) Fix $\omega \in \mathcal{C}$ and $T > 0$. Let $\mathbf{x}$
be a limit point of some subsequence of $\mathbf{X}^N(\omega,\cdot)$, as in Proposition 11. Let $t$ be a
point of differentiability of all coordinates of $\mathbf{x}$. Then, for all $i \in \mathbb{N}$,

$$\dot{a}_i(t) = \lambda(v_{i-1} - v_i), \tag{5.11}$$

$$\dot{l}_i(t) = (1-p)(v_i - v_{i+1}), \tag{5.12}$$

$$\dot{c}_i(t) = g_i(\mathbf{v}), \tag{5.13}$$

where $g$ was defined in Eq. (3.4), with the initial condition $\mathbf{v}(0) = \mathbf{v}^0$ and boundary
condition $v_0(t) - v_1(t) = 1$, $\forall t \in [0,T]$. In other words, all fluid limits of $\mathbf{V}^N(\cdot)$ are
solutions to the fluid model.

Proof. We fix some $\omega \in \mathcal{C}$ and for the rest of this proof we will suppress the dependence
on $\omega$ in our notation. The existence of Lipschitz-continuous limit points for the given
$\omega \in \mathcal{C}$ is guaranteed by Proposition 11. Let $\{\mathbf{X}^{N_k}(\cdot)\}_{k=1}^{\infty}$ be a convergent subsequence
such that $\lim_{k \to \infty} d^{\mathbb{Z}_+}(\mathbf{X}^{N_k}(\cdot), \mathbf{x}) = 0$. We now prove each of the three claims (Eqs.
(5.11)-(5.13)) separately, and index $i$ is always fixed unless otherwise stated.

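Claim 1: $\dot{a}_i(t) = \lambda(v_{i-1}(t) - v_i(t))$. Consider the sequence of trajectories
$\{A_i^{N_k}(\cdot)\}_{k=1}^{\infty}$. By construction, $A_i^N(t)$ receives a jump of magnitude $\frac{1}{N}$ at time $t$
if and only if an event happens at time $t$ and the corresponding selection random
variable, $U(\cdot)$, falls in the interval $\frac{\lambda}{1+\lambda}\left[0, V_{i-1}^N(t-) - V_i^N(t-)\right)$. Therefore, we can
write:

$$A_i^{N_k}(t + \epsilon) - A_i^{N_k}(t) = \frac{1}{N_k} \sum_{j = N_k W^{N_k}(t) + 1}^{N_k W^{N_k}(t + \epsilon)} \mathbb{I}_{I_j}(U(j)), \tag{5.14}$$

where $I_j = \frac{\lambda}{1+\lambda}\left[0, V_{i-1}^{N_k}(t_j^{N_k}-) - V_i^{N_k}(t_j^{N_k}-)\right)$ and $t_j^N$ is defined to be the time of the
$j$th jump in $W^N(\cdot)$, i.e.,

$$t_j^N = \inf\left\{ s \geq 0 : W^N(s) \geq \frac{j}{N} \right\}. \tag{5.15}$$

Note that by the definition of a fluid limit, we have that

$$\lim_{k \to \infty} \left( A_i^{N_k}(t + \epsilon) - A_i^{N_k}(t) \right) = a_i(t + \epsilon) - a_i(t). \tag{5.16}$$

The following lemma bounds the change in $a_i(t)$ on a small time interval.

Lemma 13. Fix $i$ and $t$. For all sufficiently small $\epsilon > 0$,

$$\left| a_i(t + \epsilon) - a_i(t) - \epsilon\lambda(v_{i-1}(t) - v_i(t)) \right| \leq 2\epsilon^2 L. \tag{5.17}$$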

Proof. While the proof involves heavy notation, it is based on the fact that $\omega \in \mathcal{C}$:
using Lemma 9, Eq. (5.17) follows from Eq. (5.14) by applying the convergence
properties of $W^N(t)$ (Eq. (5.1)) and $U(n)$ (Eq. (5.2)).

For the rest of the proof, fix some $\omega \in \mathcal{C}$. Also, fix $i \geq 1$, $t > 0$, and $\epsilon > 0$. Since
the limiting function $\mathbf{x}$ is $L$-Lipschitz-continuous on all coordinates by Proposition
11, there exists a non-increasing sequence $\gamma_n \downarrow 0$ such that for all $s \in [t, t + \epsilon]$ and all
sufficiently large $k$,

$$V_j^{N_k}(s) \in \left[ v_j(t) - (\epsilon L + \gamma_{N_k}), \; v_j(t) + (\epsilon L + \gamma_{N_k}) \right), \quad j \in \{i-1, i, i+1\}. \tag{5.18}$$

The above leads to^

$$\left[0, V_{i-1}^{N_k}(s) - V_i^{N_k}(s)\right) \supset \left[0, \left[ v_{i-1}(t) - v_i(t) - 2(\epsilon L + \gamma_{N_k}) \right]^+ \right),$$

$$\text{and} \quad \left[0, V_{i-1}^{N_k}(s) - V_i^{N_k}(s)\right) \subset \left[0, v_{i-1}(t) - v_i(t) + 2(\epsilon L + \gamma_{N_k}) \right), \tag{5.19}$$

for all sufficiently large $k$.

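Define the sequence of set-valued functions $\{\eta^n(t)\}$ as

$$\eta^n(t) = \frac{\lambda}{1+\lambda}\left[0, v_{i-1}(t) - v_i(t) + 2(\epsilon L + \gamma_n)\right). \tag{5.20}$$

Note that since $\gamma_n \downarrow 0$,

$$\eta^n(t) \supset \eta^{n+1}(t) \quad \text{and} \quad \bigcap_{n=1}^{\infty} \eta^n(t) = \frac{\lambda}{1+\lambda}\left[0, v_{i-1}(t) - v_i(t) + 2\epsilon L\right]. \tag{5.21}$$

We have for all sufficiently large $k$, and any $l$ such that $1 \leq l \leq N_k$,

$$A_i^{N_k}(t + \epsilon) - A_i^{N_k}(t) \leq \frac{1}{N_k} \sum_{j = N_k W^{N_k}(t) + 1}^{N_k W^{N_k}(t + \epsilon)} \mathbb{I}_{\eta^{N_k}(t)}(U(j)) \leq \frac{1}{N_k} \left( \sum_{j=1}^{N_k W^{N_k}(t + \epsilon)} \mathbb{I}_{\eta^l(t)}(U(j)) - \sum_{j=1}^{N_k W^{N_k}(t)} \mathbb{I}_{\eta^l(t)}(U(j)) \right), \tag{5.22}$$

where the first inequality follows from the second containment in Eq. (5.19), and the
second inequality follows from the monotonicity of $\{\eta^n(t)\}$ in Eq. (5.21).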
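We would like to show that for all sufficiently small $\epsilon > 0$,

$$a_i(t + \epsilon) - a_i(t) - \epsilon\lambda(v_{i-1}(t) - v_i(t)) \leq 2\epsilon^2 L. \tag{5.23}$$

To prove the above inequality, we first claim that for any interval $[a,b) \subset [0,1]$,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N W^N(t)} \mathbb{I}_{[a,b)}(U(i)) = (\lambda + 1)t(b - a). \tag{5.24}$$

^Here $[x]^+ = \max\{0, x\}$.

To see this, rewrite the left-hand side of the equation above as

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N W^N(t)} \mathbb{I}_{[a,b)}(U(i)) = \lim_{N \to \infty} (\lambda + 1)t \, \frac{1}{(\lambda+1)Nt} \sum_{i=1}^{(\lambda+1)Nt} \mathbb{I}_{[a,b)}(U(i))$$

$$\qquad + \lim_{N \to \infty} (\lambda + 1)t \, \frac{1}{(\lambda+1)Nt} \left( \sum_{i=1}^{N W^N(t)} \mathbb{I}_{[a,b)}(U(i)) - \sum_{i=1}^{(\lambda+1)Nt} \mathbb{I}_{[a,b)}(U(i)) \right). \tag{5.25}$$

Because the magnitude of the indicator function $\mathbb{I}\{\cdot\}$ is bounded by 1, we have

$$\left| \sum_{i=1}^{N W^N(t)} \mathbb{I}_{[a,b)}(U(i)) - \sum_{i=1}^{(\lambda+1)Nt} \mathbb{I}_{[a,b)}(U(i)) \right| \leq N \left| (\lambda + 1)t - W^N(t) \right|. \tag{5.26}$$

Since $\omega \in \mathcal{C}$, by Lemma 9 we have that

$$\lim_{N \to \infty} \left| (\lambda + 1)t - W^N(t) \right| = 0, \tag{5.27}$$

$$\lim_{N \to \infty} \frac{1}{(\lambda+1)Nt} \sum_{i=1}^{(\lambda+1)Nt} \mathbb{I}_{[a,b)}(U(i)) = b - a, \tag{5.28}$$

for any $t < \infty$. Combining Eqs. (5.25)-(5.28), we have

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N W^N(t)} \mathbb{I}_{[a,b)}(U(i)) = (\lambda + 1)t \lim_{N \to \infty} \frac{1}{(\lambda+1)Nt} \sum_{i=1}^{(\lambda+1)Nt} \mathbb{I}_{[a,b)}(U(i)) + \lim_{N \to \infty} \frac{1}{N} \left( \sum_{i=1}^{N W^N(t)} \mathbb{I}_{[a,b)}(U(i)) - \sum_{i=1}^{(\lambda+1)Nt} \mathbb{I}_{[a,b)}(U(i)) \right)$$

$$= (\lambda + 1)t(b - a), \tag{5.29}$$

where the second limit is zero by Eqs. (5.26) and (5.27). This establishes Eq. (5.24).
By the same argument, Eq. (5.29) also holds when $t$ is
replaced by $t + \epsilon$. Applying this result to Eq. (5.22), we have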

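$$a_i(t + \epsilon) - a_i(t) = \lim_{k \to \infty} \left( A_i^{N_k}(t + \epsilon) - A_i^{N_k}(t) \right)$$

$$\leq (t + \epsilon - t)(\lambda + 1) \frac{\lambda}{\lambda + 1} \left[ v_{i-1}(t) - v_i(t) + 2(\epsilon L + \gamma_l) \right]$$

$$= \epsilon\lambda(v_{i-1}(t) - v_i(t)) + \lambda(2\epsilon^2 L + 2\epsilon\gamma_l)$$

$$< \epsilon\lambda(v_{i-1}(t) - v_i(t)) + 2\epsilon^2 L + 2\epsilon\gamma_l, \tag{5.30}$$

for all $l \geq 1$, where the last inequality is due to the fact that $\lambda < 1$. Taking $l \to \infty$
and using the fact that $\gamma_l \downarrow 0$, we have established Eq. (5.23).
Similarly, changing the definition of $\eta^n(t)$ to

$$\eta^n(t) = \frac{\lambda}{1+\lambda}\left[0, \left[ v_{i-1}(t) - v_i(t) - 2(\epsilon L + \gamma_n) \right]^+ \right), \tag{5.31}$$

we can obtain a similar lower bound

$$a_i(t + \epsilon) - a_i(t) - \epsilon\lambda(v_{i-1}(t) - v_i(t)) \geq -2\epsilon^2 L, \tag{5.32}$$

which together with Eq. (5.23) proves the claim. Note that if $v_i(t) = v_{i-1}(t)$, the
lower bound trivially holds because $A_i^{N_k}(t)$ is a cumulative arrival process and is
hence non-decreasing in $t$ by definition. □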
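The counting limit in Eq. (5.24), which drives the bound in Lemma 13, can be checked on a simulated path. The sketch is illustrative only: the function name is hypothetical and one pseudo-random path stands in for a sample $\omega \in \mathcal{C}$.

```python
import random

def normalized_hits(N, lam, t, a, b, rng):
    """(1/N) * #{i <= N W^N(t) : a <= U(i) < b} for one simulated sample path."""
    # N W^N(t) = W(N t): count the jumps of a rate-(1 + lam) Poisson process on [0, N t]
    events, clock = 0, rng.expovariate(1.0 + lam)
    while clock <= N * t:
        events += 1
        clock += rng.expovariate(1.0 + lam)
    # each event draws one selection variable; count those landing in [a, b)
    hits = sum(1 for _ in range(events) if a <= rng.random() < b)
    return hits / N
```

With $\lambda = 0.5$, $t = 1$ and $[a,b) = [0.2, 0.5)$, the limit in Eq. (5.24) is $(\lambda + 1)t(b - a) = 0.45$, and the simulated value should approach it as `N` grows.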



Since by assumption $\mathbf{a}(\cdot)$ is differentiable at $t$, Claim 1 follows from Lemma 13
by noting $\dot{a}_i(t) = \lim_{\epsilon \downarrow 0} \frac{a_i(t + \epsilon) - a_i(t)}{\epsilon}$.

Claim 2: $\dot{l}_i(t) = (1-p)(v_i(t) - v_{i+1}(t))$. Claim 2 can be proved using an identical
approach to the one used to prove Claim 1. The proof is hence omitted.

Claim 3: $\dot{c}_i(t) = g_i(\mathbf{v})$. We prove Claim 3 by considering separately the three
cases in the definition of $g$.

(1) Case 1: $\dot{c}_i(t) = 0$, if $v_{i-1} = 0$, $v_i = 0$. Write

$$\dot{c}_i(t) = \dot{a}_i(t) - \dot{l}_i(t) - \dot{v}_i(t). \tag{5.33}$$

We calculate each of the three terms on the right-hand side of the above equation.
By Claim 1, $\dot{a}_i(t) = \lambda(v_{i-1} - v_i) = 0$, and by Claim 2, $\dot{l}_i(t) = (1-p)(v_i - v_{i+1}) = 0$.
To obtain the value for $\dot{v}_i(t)$, we use the following trick: since $v_i(t) = 0$ and
$v_i$ is non-negative, the only possibility for $v_i(t)$ to be differentiable at $t$ is that
$\dot{v}_i(t) = 0$. Since $\dot{a}_i(t)$, $\dot{l}_i(t)$, and $\dot{v}_i(t)$ are all zero, we have that $\dot{c}_i(t) = 0$.

(2) Case 2: $\dot{c}_i(t) = \min\{\lambda v_{i-1}, p\}$, if $v_i = 0$, $v_{i-1} > 0$.

In this case, the fraction of queues with at least $i$ tasks is zero, hence $v_i$ receives
no drift from the local portion of the service capacity by Claim 2. First consider
the case $v_{i-1}(t) \leq \frac{p}{\lambda}$. Here the line of arguments is similar to the one in Case 1.
By Claim 1, $\dot{a}_i(t) = \lambda(v_{i-1} - v_i) = \lambda v_{i-1}$, and by Claim 2, $\dot{l}_i(t) = (1-p)(v_i - v_{i+1}) = 0$.
Using again the same trick as in Case 1, the non-negativity of $v_i$ and the fact that
$v_i(t) = 0$ together imply that we must have $\dot{v}_i(t) = 0$. Combining the expressions
for $\dot{a}_i(t)$, $\dot{l}_i(t)$, and $\dot{v}_i(t)$, we have

$$\dot{c}_i(t) = -\dot{v}_i(t) + \dot{a}_i(t) - \dot{l}_i(t) = \lambda v_{i-1}. \tag{5.34}$$

Intuitively, here the drift due to random arrivals to queues with $i-1$ tasks, $\lambda v_{i-1}$,
is "absorbed" by the central portion of the service capacity.

If $v_{i-1}(t) > \frac{p}{\lambda}$, then the above equation would imply that $\dot{c}_i(t) = \lambda v_{i-1}(t) >
p$, if $\dot{c}_i(t)$ exists. But clearly $\dot{c}_i(t) \leq p$. This simply means $v_i(t)$ cannot be
differentiable at time $t$, if $v_i(t) = 0$, $v_{i-1}(t) > \frac{p}{\lambda}$. Hence we have the claimed
expression.

(3) Case 3: $\dot{c}_i(t) = p$, if $v_i > 0$, $v_{i+1} > 0$.

Since there is a positive fraction of queues with more than $i$ tasks, it follows that $V_i^N$ is decreased by $\frac{1}{N}$ whenever a central token becomes available. Formally, for some small enough $\epsilon$, there exists $K$ such that $V_i^{N_k}(s) > 0$ for all $k \ge K$, $s \in [t, t+\epsilon]$. Given the coupling construction, this implies that for all $k \ge K$, $s \in [t, t+\epsilon]$,

vfH^)-vfHt) = - E i[i--„i)(f/a))- 

Using the same arguments as in the proof of Lemma [131 "we see that the right- 
hand side of the above equation converges to {s-t)p+o{e) as /c -> oo. Hence, Vj(t) = 

lim^ohmfc^oo — z — ■ =p. 



Finally, note that the boundary condition $v_0(t) - v_1(t) = 1$ is a consequence of the fact that $V_0^N(t) - V_1^N(t) = S_0^N(t) = 1$ for all $t$. This concludes the proof of Proposition 12. □
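As an illustrative sketch (not part of the original proof), the three cases of Claim 3 and the drift terms of Claims 1 and 2 can be combined into a small Python function that evaluates the fluid drift on a truncated state vector. The function names `central_drift` and `fluid_drift`, and the convention that coordinates beyond the end of the list are zero, are assumptions made here for illustration only.

```python
def central_drift(v, i, lam, p):
    """g_i(v): drift of coordinate i (i >= 1) due to central service tokens."""
    if v[i] > 0:
        return p                        # a positive fraction of queues has >= i tasks
    if v[i - 1] > 0:
        return min(lam * v[i - 1], p)   # arrivals into level i are absorbed centrally
    return 0.0                          # v_{i-1} = v_i = 0: nothing to serve


def fluid_drift(v, lam, p):
    """Drift of v = (v_0, ..., v_n), treating v_{n+1} as 0 (truncation assumption)."""
    n = len(v)
    dv = [0.0] * n
    for i in range(1, n):
        nxt = v[i + 1] if i + 1 < n else 0.0
        arrivals = lam * (v[i - 1] - v[i])       # Claim 1
        local = (1 - p) * (v[i] - nxt)           # Claim 2
        dv[i] = arrivals - local - central_drift(v, i, lam, p)  # Claim 3
    dv[0] = dv[1]  # boundary condition v_0 = v_1 + 1
    return dv


# Empty system: v_0 = 1 and all other coordinates are zero.
print(fluid_drift([1.0, 0.0, 0.0], lam=0.5, p=0.25))
print(fluid_drift([1.0, 0.0, 0.0], lam=0.5, p=0.75))
```

With $p \ge \lambda$ the empty state has zero drift in every coordinate, whereas with $p < \lambda$ the first coordinate starts to grow.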
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Chapter 6 

Properties of the Fluid Model 



In this chapter, we establish several important properties of the fluid model. We begin by proving Theorem 2 in Section 6.1, which states that the fluid model admits a unique invariant state for each pair of $p$ and $\lambda$. Section 6.2 is devoted to proving the important property that for any initial condition $v^0 \in \overline{\mathcal{V}}^\infty$, the fluid model admits a unique solution $v(\cdot)$. As a corollary, it is shown that the fluid solution $v(\cdot)$ depends continuously on the initial condition $v^0$, and this technical result will be important for proving the steady-state convergence theorem in the next chapter. Using the uniqueness result and the results from the last chapter, one of our main approximation theorems, Theorem 6, is proved in Section 6.3, which establishes the convergence of stochastic sample paths to the unique solution of the fluid model over any finite time horizon, with high probability. Finally, in Section 6.4, we prove that the solutions to the fluid model are globally stable (Theorem 5), so any solution $v(t)$ converges to the unique invariant state $v^I$ as $t \to \infty$. This suggests that the qualitative properties derived from $v^I$ serve as a good approximation for the transient behavior of the system. We note that by the end of this chapter, we will have established all transient approximation results, which correspond to the path

$$V^N(t) \xrightarrow{N\to\infty} v(t) \xrightarrow{t\to\infty} v^I \qquad (6.1)$$

as was illustrated in Figure 3-3 of Chapter 3. The other path in Figure 3-3, namely, the approximation of the steady-state distributions of $V^N(\cdot)$ by $v^I$, will be dealt with in the next chapter.
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6.1 Invariant State of the Fluid Model 

In this section we prove Theorem 2, which gives explicit expressions for the (unique) invariant state of the fluid model.

Proof. (Theorem 2) In this proof we will be working with both $v^I$ and $s^I$, with the understanding that $s_i^I = v_i^I - v_{i+1}^I$, $\forall i \ge 0$. It can be verified that the expressions given in all cases are valid invariant states, by checking that $F(v^I) = 0$. We show they are indeed unique.

First, note that if $p \ge \lambda$, then $F_1(v) < 0$ whenever $v_1 > 0$. Since $v_1^I \ge 0$, we must have $v_1^I = 0$, which by the boundary conditions implies that all other $v_i^I$ must also be zero. This proves case (2) of the theorem.

Now suppose that $0 < p < \lambda$. We will prove case (4). We observe that $F_1(v) > 0$ whenever $v_1 = 0$. Hence $v_1^I$ must be positive. By Eq. (3.4) this implies that $g_1(v^I) = p$. Substituting $g_1(v^I)$ in Eq. (3.3), along with the boundary condition $v_0^I - v_1^I = s_0^I = 1$, we have

$$0 = \lambda \cdot 1 - (1-p)s_1^I - p, \qquad (6.2)$$

which yields

$$s_1^I = \frac{\lambda - p}{1-p}. \qquad (6.3)$$

Repeating the same argument, we obtain the recursion

$$s_i^I = \frac{\lambda s_{i-1}^I - p}{1-p}, \qquad (6.4)$$

for as long as $s_i^I$ (and therefore $v_i^I$) remains positive. Combining this with the expression for $s_1^I$, we have

$$s_i^I = \frac{1-\lambda}{1-(p+\lambda)}\left(\frac{\lambda}{1-p}\right)^i - \frac{p}{1-(p+\lambda)}, \quad 1 \le i \le i^*(p,\lambda), \qquad (6.5)$$

where $i^*(p,\lambda) = \left\lfloor \log_{\frac{\lambda}{1-p}} \frac{p}{1-\lambda} \right\rfloor$ marks the last coordinate where $s_i^I$ remains non-negative.

This proves uniqueness of $s_i^I$ up to $i \le i^*(p,\lambda)$. We can then use the same argument as in case (2) to show that $s_i^I$ must be equal to zero for all $i > i^*(p,\lambda)$. Cases (1) and (3) can be established using arguments similar to those used in proving case (4). This completes the proof. □
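The recursion in Eq. (6.4) and the closed form in Eq. (6.5) can be cross-checked numerically. The following Python sketch is only an illustration under the assumptions of case (4), that is, $0 < p < \lambda < 1$ and $\lambda \ne 1-p$; the function names are hypothetical and do not come from the thesis.

```python
import math


def invariant_s_recursive(lam, p, max_len=10_000):
    """Iterate s_i = (lam * s_{i-1} - p) / (1 - p) from s_0 = 1 while s_i > 0."""
    if not (0 < p < lam < 1):
        raise ValueError("this sketch assumes 0 < p < lam < 1")
    s = [1.0]
    while len(s) < max_len:
        nxt = (lam * s[-1] - p) / (1 - p)
        if nxt <= 0:
            break
        s.append(nxt)
    return s


def invariant_s_closed_form(lam, p, i):
    """Closed-form expression of Eq. (6.5); requires lam != 1 - p."""
    denom = 1 - (p + lam)
    return (1 - lam) / denom * (lam / (1 - p)) ** i - p / denom


def last_positive_index(lam, p):
    """i*(p, lam) = floor(log_{lam/(1-p)} (p / (1 - lam)))."""
    return math.floor(math.log(p / (1 - lam), lam / (1 - p)))


lam, p = 0.9, 0.05
s = invariant_s_recursive(lam, p)
print(len(s) - 1, last_positive_index(lam, p))
print(sum(s[1:]))  # v_1 in the invariant state, i.e. the sum of s_i over i >= 1
```

For the parameters above, the number of positive coordinates produced by the recursion agrees with $i^*(p,\lambda)$, and each coordinate agrees with the closed form up to floating-point error.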
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6.2 Uniqueness of Fluid Limits & Continuous De- 
pendence on Initial Conditions 

We now prove Theorem 4, which states that given an initial condition $v^0 \in \overline{\mathcal{V}}^\infty$, a solution to the fluid model exists and is unique. As a direct consequence of the proof, we obtain an important corollary, that the unique solution $v(\cdot)$ depends continuously on the initial condition $v^0$.

The uniqueness result justifies the use of the fluid approximation, in the sense that the evolution of the stochastic system is close to a single trajectory. The uniqueness along with the continuous dependence on the initial condition will be used to prove convergence of steady-state distributions to $v^I$ (Theorem 7).

We note that, in general, the uniqueness of solutions is not guaranteed for a differential equation with a discontinuous drift (see, e.g., [H]). In our case, $F(\cdot)$ is discontinuous on the domain $\overline{\mathcal{V}}^\infty$ due to the drift associated with central service tokens (Eq. (3.4)).

Proof. (Theorem 4) The existence of a solution to the fluid model follows from the fact that $V^N$ has a limit point (Proposition 11) and that all limit points of $V^N$ are solutions to the fluid model (Proposition 12). We now show uniqueness. Define $i^p(v) = \sup\{i : v_i > 0\}$.¹ Let $v(t)$, $w(t)$ be two solutions to the fluid model such that $v(0) = v^0$ and $w(0) = w^0$, with $v^0, w^0 \in \overline{\mathcal{V}}^\infty$. At any regular point $t > 0$, where all coordinates of $v(t)$, $w(t)$ are differentiable, without loss of generality, assume $i^p(v(t)) \le i^p(w(t))$, with equality if both are infinite. Let $a^v(\cdot)$ and $a^w(\cdot)$ be the arrival trajectories corresponding to $v(\cdot)$ and $w(\cdot)$, respectively, and similarly for $l$ and $c$. Since $v_0(t) = v_1(t) + 1$ for all $t \ge 0$ by the boundary condition, and $\dot{v}_1 = \dot{a}_1^v - \dot{l}_1^v - \dot{c}_1^v$, for notational convenience we will write

$$\dot{v}_0 = \dot{a}_0^v - \dot{l}_0^v - \dot{c}_0^v, \qquad (6.6)$$

where

$$\dot{a}_0^v \triangleq \dot{a}_1^v, \quad \dot{l}_0^v \triangleq \dot{l}_1^v, \quad \text{and} \quad \dot{c}_0^v \triangleq \dot{c}_1^v. \qquad (6.7)$$

The same notation will be used for $w(t)$.

¹ $i^p(v)$ can be infinite; this happens if all coordinates of $v$ are positive.

We have

$$
\begin{aligned}
\frac{d}{dt}\|v-w\|_w^2 &= \frac{d}{dt}\sum_{i=0}^{\infty}\frac{|v_i-w_i|^2}{2^i} \overset{(a)}{=} \sum_{i=0}^{\infty}\frac{(v_i-w_i)(\dot{v}_i-\dot{w}_i)}{2^{i-1}} \\
&= \sum_{i=0}^{\infty}\frac{(v_i-w_i)\left[(\dot{a}_i^v-\dot{l}_i^v)-(\dot{a}_i^w-\dot{l}_i^w)\right]}{2^{i-1}} - \sum_{i=0}^{\infty}\frac{(v_i-w_i)(\dot{c}_i^v-\dot{c}_i^w)}{2^{i-1}} \\
&\overset{(b)}{\le} C\|v-w\|_w^2 - \sum_{i=0}^{i^p(v)}\frac{1}{2^{i-1}}(v_i-w_i)(p-p) \\
&\quad - \frac{1}{2^{i^p(v)}}\left(0-w_{i^p(v)+1}\right)\left(\min\{\lambda v_{i^p(v)},p\}-p\right) \\
&\quad - \sum_{i=i^p(v)+2}^{i^p(w)}\frac{1}{2^{i-1}}(0-w_i)(0-p) \\
&\quad - \sum_{i=i^p(w)+1}^{\infty}\frac{1}{2^{i-1}}(0-0)(\dot{c}_i^v-\dot{c}_i^w) \\
&\le C\|v-w\|_w^2, \qquad (6.8)
\end{aligned}
$$

where $C = 6(\lambda+1-p)$. We first justify the existence of the derivative $\frac{d}{dt}\|v-w\|_w^2$ and the exchange of limits in (a). Because $v_i(t)$ and $w_i(t)$ are $L$-Lipschitz-continuous for all $i$, it follows that there exists $L' > 0$ such that for all $i$, $h(i,s) = |v_i(s) - w_i(s)|^2$ is $L'$-Lipschitz-continuous in the second argument, within a small neighborhood around $s = t$. In other words,

$$\left|\frac{h(i,t+\epsilon) - h(i,t)}{\epsilon}\right| \le L' \qquad (6.9)$$

for all $i$ and all sufficiently small $\epsilon$. Then,

$$\frac{d}{dt}\|v-w\|_w^2 = \lim_{\epsilon\downarrow 0}\sum_{i=0}^{\infty} 2^{-i}\,\frac{h(i,t+\epsilon) - h(i,t)}{\epsilon} = \lim_{\epsilon\downarrow 0}\int_{i\in\mathbb{Z}_+}\frac{h(i,t+\epsilon) - h(i,t)}{\epsilon}\,d\mu_N, \qquad (6.10)$$

where $\mu_N$ is a measure on $\mathbb{Z}_+$ defined by $\mu_N(i) = 2^{-i}$, $i \in \mathbb{Z}_+$. By Eq. (6.9) and the dominated convergence theorem, we can exchange the limit and integration in Eq. (6.10) and obtain

$$\frac{d}{dt}\|v-w\|_w^2 = \int_{i\in\mathbb{Z}_+}\lim_{\epsilon\downarrow 0}\frac{h(i,t+\epsilon) - h(i,t)}{\epsilon}\,d\mu_N, \qquad (6.11)$$

which justifies step (a) in Eq. (6.8). Step (b) follows from the fact that $a$ and $l$ are both continuous and linear in $v$ (see Eqs. (5.11)-(5.13)). The specific value of $C$ can be derived after some straightforward algebra, which we isolate in the following claim.



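Claim 14.

$$\sum_{i=0}^{\infty}\frac{(v_i-w_i)\left[(\dot{a}_i^v-\dot{l}_i^v)-(\dot{a}_i^w-\dot{l}_i^w)\right]}{2^{i-1}} \le 6(\lambda+1-p)\|v-w\|_w^2. \qquad (6.12)$$

Proof. Let $m_i = v_i - w_i$. Note that for all $i \ge 1$,

$$
\begin{aligned}
&(v_i-w_i)\left[(\dot{a}_i^v-\dot{l}_i^v)-(\dot{a}_i^w-\dot{l}_i^w)\right] \\
&\quad= (v_i-w_i)\left[\lambda(v_{i-1}-w_{i-1}) - \lambda(v_i-w_i) - (1-p)(v_i-w_i) + (1-p)(v_{i+1}-w_{i+1})\right] \\
&\quad= m_i\left(\lambda m_{i-1} - \lambda m_i - (1-p)m_i + (1-p)m_{i+1}\right) \\
&\quad\le \frac{\lambda}{2}\left(m_{i-1}^2 + m_i^2\right) - (\lambda+1-p)m_i^2 + \frac{1-p}{2}\left(m_i^2 + m_{i+1}^2\right) \\
&\quad= \frac{\lambda}{2}m_{i-1}^2 + \frac{1-p}{2}m_{i+1}^2 - \frac{\lambda+1-p}{2}m_i^2 \\
&\quad\le \frac{\lambda}{2}m_{i-1}^2 + \frac{1-p}{2}m_{i+1}^2. \qquad (6.13)
\end{aligned}
$$

For $i = 0$, by Eq. (6.7) and the fact that $m_0 = m_1$, we have

$$(v_0-w_0)\left[(\dot{a}_0^v-\dot{l}_0^v)-(\dot{a}_0^w-\dot{l}_0^w)\right] = (v_1-w_1)\left[(\dot{a}_1^v-\dot{l}_1^v)-(\dot{a}_1^w-\dot{l}_1^w)\right] \le \frac{\lambda}{2}m_0^2 + \frac{1-p}{2}m_2^2. \qquad (6.14)$$

Combining Eqs. (6.13) and (6.14), we have

$$
\begin{aligned}
\sum_{i=0}^{\infty}\frac{(v_i-w_i)\left[(\dot{a}_i^v-\dot{l}_i^v)-(\dot{a}_i^w-\dot{l}_i^w)\right]}{2^{i-1}}
&\le 2\left(\frac{\lambda}{2}m_0^2 + \frac{1-p}{2}m_2^2\right) + \sum_{i=1}^{\infty}\frac{1}{2^{i-1}}\left(\frac{\lambda}{2}m_{i-1}^2 + \frac{1-p}{2}m_{i+1}^2\right) \\
&\le 6(\lambda+1-p)\|v-w\|_w^2. \qquad (6.15)
\end{aligned}
$$

This proves the claim. □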
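As a sanity check on the constant in Claim 14, the inequality can be evaluated on randomly generated pairs of states. The Python sketch below is an illustration only: it uses the linear part of the drift, $\lambda(v_{i-1}-v_i) - (1-p)(v_i-v_{i+1})$, on truncated vectors whose coordinates beyond the last index are taken to be zero, and all function names are hypothetical.

```python
import random


def weighted_sq_norm(x):
    """||x||_w^2 = sum_i x_i^2 / 2^i."""
    return sum(xi * xi / 2 ** i for i, xi in enumerate(x))


def linear_drift(v, lam, p):
    """Arrival minus local-service drift; coordinate 0 copies coordinate 1 (Eq. (6.7))."""
    n = len(v)
    d = [0.0] * n
    for i in range(1, n):
        nxt = v[i + 1] if i + 1 < n else 0.0
        d[i] = lam * (v[i - 1] - v[i]) - (1 - p) * (v[i] - nxt)
    d[0] = d[1]
    return d


def random_state(n, rng):
    """Random v with v_i = sum_{j >= i} s_j, where 1 = s_0 >= s_1 >= ... >= s_n >= 0."""
    s = [1.0] + sorted((rng.random() for _ in range(n)), reverse=True)
    return [sum(s[i:]) for i in range(len(s))]


def claim14_sides(v, w, lam, p):
    """Return (left-hand side, right-hand side) of the inequality in Claim 14."""
    dv, dw = linear_drift(v, lam, p), linear_drift(w, lam, p)
    lhs = sum((vi - wi) * (a - b) / 2 ** (i - 1)
              for i, (vi, wi, a, b) in enumerate(zip(v, w, dv, dw)))
    rhs = 6 * (lam + 1 - p) * weighted_sq_norm([vi - wi for vi, wi in zip(v, w)])
    return lhs, rhs


rng = random.Random(0)
for _ in range(5):
    v, w = random_state(8, rng), random_state(8, rng)
    print(claim14_sides(v, w, lam=0.9, p=0.3))
```

In every sampled pair the left-hand side should not exceed the right-hand side, as the claim asserts; such a check does not replace the proof.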

Now suppose that $v^0 = w^0$. By Gronwall's inequality and Eq. (6.8), we have

$$\|v(t) - w(t)\|_w^2 \le \|v(0) - w(0)\|_w^2\, e^{Ct} = 0, \quad \forall t \in [0,\infty), \qquad (6.16)$$

which establishes the uniqueness of fluid limits on $[0,\infty)$. □

The following corollary is an easy, but important, consequence of the uniqueness 
proof. 

Corollary 15. (Continuous Dependence on Initial Conditions) Denote by $v(v^0,\cdot)$ the unique solution to the fluid model given initial condition $v^0 \in \overline{\mathcal{V}}^\infty$. If $w^n \in \overline{\mathcal{V}}^\infty$ for all $n$, and $\|w^n - v^0\|_w \to 0$ as $n \to \infty$, then for all $t \ge 0$,

$$\lim_{n\to\infty}\|v(w^n,t) - v(v^0,t)\|_w = 0. \qquad (6.17)$$

Proof. The continuity with respect to the initial condition is a direct consequence of the inequality in Eq. (6.16): if $v(w^n,\cdot)$ is a sequence of fluid limits with initial conditions $w^n \in \overline{\mathcal{V}}^\infty$ and if $\|w^n - v^0\|_w^2 \to 0$ as $n \to \infty$, then for all $t \in [0,\infty)$,

$$\|v(v^0,t) - v(w^n,t)\|_w^2 \le \|v^0 - w^n\|_w^2\, e^{Ct} \to 0, \quad \text{as } n \to \infty.$$

This completes the proof. □

6.2.1 v() versus s(), and the Uniqueness of Fluid Limits 



As was mentioned in Section I2T21 we have chosen to work primarily with the aggregate 
queue length process, $V^N(\cdot)$ (Eq. (2.1)), instead of the normalized queue length process, $S^N(\cdot)$ (Eq. (2.2)). We offer some justification for such a choice in this section.

Recall that for any finite $N$, the two processes are related by simple transformations, namely,

$$V_i^N(t) \triangleq \sum_{j=i}^{\infty} S_j^N(t), \quad i \ge 0,$$

$$\text{and} \quad S_i^N(t) \triangleq V_i^N(t) - V_{i+1}^N(t), \quad i \ge 0.$$

Therefore, there seems to be no obvious reason to favor one representation over the other when $N$ is finite. However, in the limit of $N \to \infty$, it turns out that the fluid model associated with $V^N(\cdot)$ is much easier to work with in establishing the important property of the uniqueness of fluid solutions (Theorem 4).

A key to the proof of Theorem 4 is a contraction of the drift associated with $v(\cdot)$ (Eq. (6.8)), also known as the one-sided Lipschitz continuity (OSL) condition in the dynamical systems literature (see, e.g., [H]). We first give a definition of OSL that applies to our setting.

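Definition 16. Let the coordinates of $\mathbb{R}^\infty$ be indexed by $\mathbb{Z}_+$ so that $x = (x_0, x_1, x_2, \ldots)$ for all $x \in \mathbb{R}^\infty$. A function $H : \mathbb{R}^\infty \to \mathbb{R}^\infty$ is said to be one-sided Lipschitz-continuous (OSL) over a subset $D \subset \mathbb{R}^\infty$ if there exists a constant $C$ such that for every $x, y \in D$,

$$\langle x - y, H(x) - H(y)\rangle_w \le C\|x - y\|_w^2, \qquad (6.18)$$

where the inner product $\langle\cdot,\cdot\rangle_w$ on $\mathbb{R}^\infty$ is defined by

$$\langle x, y\rangle_w = \sum_{i=0}^{\infty}\frac{x_i y_i}{2^i}. \qquad (6.19)$$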

What is the usefulness of the above definition in the context of proving uniqueness 
of solutions to a fluid model? Recall that F(-) is the drift of the fluid model, as in 
Eq. dM]), i.e., 

$$\dot{v}(t) = F(v(t)), \qquad (6.20)$$

whenever $v(\cdot)$ is differentiable at $t$. Let $v(\cdot)$ and $w(\cdot)$ be two solutions to the fluid model such that both are differentiable at $t$, as in the proof of Theorem 4. We have

$$\frac{d}{dt}\|v(t) - w(t)\|_w^2 = 2\,\langle v(t) - w(t), F(v(t)) - F(w(t))\rangle_w. \qquad (6.21)$$

Therefore, if $F(\cdot)$ is one-sided Lipschitz-continuous, as defined by Eq. (6.18), we immediately obtain the key inequality in Eq. (6.8), from which the uniqueness of fluid solutions follows. The computation carried out in Eq. (6.8) was essentially verifying the OSL condition of $F(\cdot)$ on the domain $\overline{\mathcal{V}}^\infty$.

For the state representation based on $s(\cdot)$, can one use the same proof technique to show the uniqueness of $s(\cdot)$ by working directly with the drift associated with $s(\cdot)$? Recall that

$$s_i(t) = v_i(t) - v_{i+1}(t), \quad \forall i \ge 0, \qquad (6.22)$$

so that at all $t$ where $v(t)$ is differentiable, the drift $\dot{s}(t)$ is given by

$$H_i(s(t)) = \dot{s}_i(t) = \dot{v}_i(t) - \dot{v}_{i+1}(t) = \lambda(s_{i-1} - s_i) - (1-p)(s_i - s_{i+1}) - g_i^s(s), \qquad (6.23)$$

for all $i \ge 1$, where $g_i^s(s) = g_i(v) - g_{i+1}(v)$, i.e.,

$$
g_i^s(s) =
\begin{cases}
0, & s_i > 0,\ s_{i+1} > 0, \\
p - \min\{\lambda s_i, p\}, & s_i > 0,\ s_{i+1} = 0, \\
\min\{\lambda s_{i-1}, p\}, & s_i = 0,\ s_{i-1} > 0, \\
0, & s_i = 0,\ s_{i-1} = 0.
\end{cases}
\qquad (6.24)
$$

Interestingly, it turns out that the drift $H(\cdot)$, defined in Eq. (6.23), does not satisfy the one-sided Lipschitz continuity condition in general. We show this fact by inspecting a specific example. To keep the example as simple as possible, we consider a degenerate case.

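Claim 17. If $\lambda = 0$ and $p = 1$, then $H(\cdot)$ is not one-sided Lipschitz-continuous on its domain $\overline{\mathcal{S}}^\infty$, where $\overline{\mathcal{S}}^\infty$ was defined in Eq. (2.6) as

$$\overline{\mathcal{S}}^\infty = \left\{ s \in \overline{\mathcal{S}} : \sum_{i=1}^{\infty} s_i < \infty \right\}.$$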

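Proof. We will look at a specific instance where the condition (6.18) cannot be satisfied for any $C$. For the rest of the proof, a vector $s \in \overline{\mathcal{S}}^\infty$ will be written explicitly as $s = (s_0, s_1, s_2, \ldots)$. Consider two vectors

$$s^a = (1, \alpha, 0, 0, \ldots) \quad \text{and} \quad s^b = (1, \alpha+\epsilon, \beta, 0, 0, \ldots), \qquad (6.25)$$

where $1 \ge \alpha + \epsilon \ge \beta > 0$, $1 \ge \alpha > 0$, and $s_i^a = s_i^b = 0$ for all $i \ge 3$. Note that $s^a, s^b \in \overline{\mathcal{S}}^\infty$.

To prove the claim, it suffices to show that for any value of $C$, there exist some values of $\alpha$, $\beta$, and $\epsilon$ such that

$$\langle s^b - s^a, H(s^b) - H(s^a)\rangle_w > C\|s^b - s^a\|_w^2. \qquad (6.26)$$

Since $\lambda = 0$ and $p = 1$, by the definition of $H(\cdot)$ (Eqs. (6.23) and (6.24)), we have

$$H(s^a) = (0, -1, 0, 0, \ldots) \quad \text{and} \quad H(s^b) = (0, 0, -1, 0, 0, \ldots). \qquad (6.27)$$

Combining Eqs. (6.25) and (6.27), we have

$$s^b - s^a = (0, \epsilon, \beta, 0, 0, \ldots),$$
$$\text{and} \quad H(s^b) - H(s^a) = (0, 1, -1, 0, 0, \ldots),$$

which yields

$$\langle s^b - s^a, H(s^b) - H(s^a)\rangle_w = \frac{\epsilon}{2} - \frac{\beta}{4}. \qquad (6.28)$$

Since

$$C\|s^b - s^a\|_w^2 = C\left(\frac{\epsilon^2}{2} + \frac{\beta^2}{4}\right), \qquad (6.29)$$

we have that for all $C > 0$ and all $\epsilon < \frac{1}{C}$,

$$\langle s^b - s^a, H(s^b) - H(s^a)\rangle_w = \frac{\epsilon}{2} - \frac{\beta}{4} > C\left(\frac{\epsilon^2}{2} + \frac{\beta^2}{4}\right) = C\|s^b - s^a\|_w^2, \qquad (6.30)$$

for all sufficiently small $\beta$, which proves Eq. (6.26). This completes the proof of the claim. □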
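The counterexample in the proof of Claim 17 can be reproduced numerically. The sketch below implements the drift $H(\cdot)$ of Eqs. (6.23) and (6.24) on truncated vectors (coordinates beyond the list are assumed to be zero) and compares the two sides of Eq. (6.26). The function names and the particular values of $\alpha$, $\epsilon$, $\beta$, and $C$ are assumptions chosen for illustration.

```python
def central_drift_s(s, i, lam, p):
    """g^s_i(s) as in Eq. (6.24), for i >= 1."""
    nxt = s[i + 1] if i + 1 < len(s) else 0.0
    if s[i] > 0:
        return 0.0 if nxt > 0 else p - min(lam * s[i], p)
    return min(lam * s[i - 1], p) if s[i - 1] > 0 else 0.0


def drift_H(s, lam, p):
    """H(s) as in Eq. (6.23); coordinate 0 has zero drift because s_0 = 1 for all t."""
    out = [0.0] * len(s)
    for i in range(1, len(s)):
        nxt = s[i + 1] if i + 1 < len(s) else 0.0
        out[i] = (lam * (s[i - 1] - s[i]) - (1 - p) * (s[i] - nxt)
                  - central_drift_s(s, i, lam, p))
    return out


def inner_w(x, y):
    """Weighted inner product of Eq. (6.19)."""
    return sum(xi * yi / 2 ** i for i, (xi, yi) in enumerate(zip(x, y)))


def osl_sides(xa, xb, drift, C, lam, p):
    """Return (<xb - xa, drift(xb) - drift(xa)>_w, C * ||xb - xa||_w^2)."""
    dx = [b - a for a, b in zip(xa, xb)]
    dd = [b - a for a, b in zip(drift(xa, lam, p), drift(xb, lam, p))]
    return inner_w(dx, dd), C * inner_w(dx, dx)


lam, p, C = 0.0, 1.0, 100.0
alpha, eps, beta = 0.5, 1.0 / (2 * C), 1e-6   # eps < 1/C and beta small
s_a = [1.0, alpha, 0.0, 0.0]
s_b = [1.0, alpha + eps, beta, 0.0]
print(drift_H(s_a, lam, p), drift_H(s_b, lam, p))
print(osl_sides(s_a, s_b, drift_H, C, lam, p))
```

For these values the inner product exceeds $C\|s^b - s^a\|_w^2$, so the OSL inequality fails for this choice of $C$; repeating the computation with a larger $C$ and $\epsilon = 1/(2C)$ gives the same outcome.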

Claim 17 indicates that a direct proof of uniqueness of fluid solutions using the OSL property of the drift will not work for $s(\cdot)$. The uniqueness of $s(\cdot)$ should still hold, but the proof can potentially be much more difficult, requiring an examination of all points of discontinuity of $H(\cdot)$ on the domain $\overline{\mathcal{S}}^\infty$.

We now give some intuition as to why the discontinuity in Claim 17 occurs for $H(\cdot)$, but not for $F(\cdot)$. The key difference lies in the expressions of the drifts due to central service tokens in the two fluid models, namely, $g(\cdot)$ (Eq. (3.4)) for $v(\cdot)$ and $g^s(\cdot)$ (Eq. (6.24)) for $s(\cdot)$. For $g^s(\cdot)$, note that

$$g_i^s(s) = 0, \quad \text{if } s_i > 0 \text{ and } s_{i+1} > 0, \qquad (6.31)$$

$$\text{and} \quad g_i^s(s) = p - \min\{\lambda s_i, p\}, \quad \text{if } s_i > 0 \text{ and } s_{i+1} = 0. \qquad (6.32)$$

In other words, the $i$th coordinate of $s(t)$, $s_i(t)$, receives no drift due to the central service tokens if there is a strictly positive fraction of queues in the system with at least $i+1$ tasks, that is, if $s_{i+1}(t) > 0$ (Eq. (6.31)). However, as soon as $s_{i+1}(t)$ becomes zero, $s_i(t)$ immediately receives a strictly positive amount of drift due to the central service tokens (Eq. (6.32)), as long as $\lambda s_i(t) < p$. Physically, since the central server always targets the longest queues, this means that when $s_{i+1}(t)$ becomes zero, the set of queues with exactly $i$ tasks becomes the longest in the system, and begins to receive a positive amount of attention from the central server. Such a sudden change in the drift of $s_i(t)$ as a result of $s_{i+1}(t)$ hitting zero is a main cause of the failure of the OSL condition, and this can be observed in Eq. (6.30) as $\beta \to 0$. In general, the type of discontinuity that was exploited in the proof of Claim 17 can happen at infinitely many points in $\overline{\mathcal{S}}^\infty$. The particular choices of $\lambda = 0$ and $p = 1$ were non-essential, and were only made to simplify the calculations.

We now turn to the expression for $g(\cdot)$, the drift of $v(\cdot)$ due to the central service tokens. We have that

$$g_i(v) = p, \quad \text{whenever } v_i > 0. \qquad (6.33)$$

Note that the above-mentioned discontinuity in $g^s(\cdot)$ is not present in $g(\cdot)$. This is not surprising: since $v_i(t) = \sum_{j=i}^{\infty} s_j(t)$, $v_i(t)$ receives a constant amount of drift from the central service tokens as long as $v_i(t) > 0$, regardless of the values of $v_j(t)$, $j \ge i+1$. By adding up the coordinates $s_j(\cdot)$, $j \ge i$, to obtain $v_i(\cdot)$, we have effectively eliminated many of the drift discontinuities in $s(\cdot)$. This is a key reason for the one-sided Lipschitz continuity condition to hold for $F(\cdot)$.

To illustrate this "smoothing" effect, consider again the examples of $s^a$ and $s^b$ in Eq. (6.25). In terms of $v$, we have

$$v^a = (1+\alpha, \alpha, 0, 0, \ldots) \quad \text{and} \quad v^b = (1+\alpha+\epsilon+\beta, \alpha+\epsilon+\beta, \beta, 0, 0, \ldots). \qquad (6.34)$$

We then have

$$F(v^a) = (-1, -1, 0, 0, \ldots) \quad \text{and} \quad F(v^b) = (-1, -1, -1, 0, 0, \ldots). \qquad (6.35)$$

Combining Eqs. (6.34) and (6.35), we have

$$v^b - v^a = (\epsilon+\beta, \epsilon+\beta, \beta, 0, 0, \ldots),$$
$$\text{and} \quad F(v^b) - F(v^a) = (0, 0, -1, 0, 0, \ldots).$$

This implies that for all $C > 0$,

$$\langle v^b - v^a, F(v^b) - F(v^a)\rangle_w = -\frac{\beta}{4} \le C\|v^b - v^a\|_w^2, \qquad (6.36)$$

for all $\beta > 0$. Contrasting Eq. (6.36) with Eq. (6.30), notice that the $\frac{1}{2}\epsilon$ term is no longer present in the expression for the inner product, as a result of the additional "smoothness" of $F(\cdot)$. Therefore, unlike in the case of $H(\cdot)$, the OSL condition for $F$ does not break down at $v^a$ and $v^b$.
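The computation in Eqs. (6.34)-(6.36) can likewise be checked with a few lines of Python. This is a sketch under stated assumptions: the drift $F$ is evaluated on truncated vectors with coordinates beyond the list taken to be zero, the helper names are hypothetical, and the numerical values of $\alpha$, $\epsilon$, and $\beta$ are arbitrary illustrative choices.

```python
def fluid_drift(v, lam, p):
    """Drift F(v) of the fluid model on v = (v_0, ..., v_n), with v_{n+1} taken as 0."""
    n = len(v)
    dv = [0.0] * n
    for i in range(1, n):
        nxt = v[i + 1] if i + 1 < n else 0.0
        if v[i] > 0:
            g = p                          # Eq. (6.33)
        elif v[i - 1] > 0:
            g = min(lam * v[i - 1], p)
        else:
            g = 0.0
        dv[i] = lam * (v[i - 1] - v[i]) - (1 - p) * (v[i] - nxt) - g
    dv[0] = dv[1]                          # boundary condition v_0 = v_1 + 1
    return dv


def inner_w(x, y):
    """Weighted inner product of Eq. (6.19)."""
    return sum(xi * yi / 2 ** i for i, (xi, yi) in enumerate(zip(x, y)))


lam, p = 0.0, 1.0
alpha, eps, beta = 0.5, 0.005, 1e-6
v_a = [1 + alpha, alpha, 0.0, 0.0]
v_b = [1 + alpha + eps + beta, alpha + eps + beta, beta, 0.0]
F_a, F_b = fluid_drift(v_a, lam, p), fluid_drift(v_b, lam, p)
dv = [b - a for a, b in zip(v_a, v_b)]
dF = [b - a for a, b in zip(F_a, F_b)]
print(F_a, F_b)
print(inner_w(dv, dF))  # equals -beta / 4, with no eps / 2 term
```

The inner product is non-positive here, so the OSL inequality holds at this pair of points for every $C > 0$, in contrast with the $s$-representation.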
Figure 6-1: Comparison between the $V^N[\cdot]$ and $S^N[\cdot]$ representations.

The difference in drift patterns described above can also be observed in finite systems. The two graphs in Figure 6-1 display the same sample path of the embedded discrete-time Markov chain, in the representations of $S^N$ and $V^N$, respectively. Here $N = 10000$, $p = 1$, and $\lambda = 0.5$, with an initial condition $S^N[0] = (1, 0.1, 0.1, 0, 0, \ldots)$ (i.e., 100 queues contain 2 tasks and the rest of queues are empty). Notice that when $S_2^N[n]$ hits zero, $S_1^N[n]$ immediately receives an extra amount of downward drift. On the other hand, there is no change in drift for $V_1^N[n]$ when $V_2^N[n]$ hits zero. This is consistent with the previous analysis on the fluid models.
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In summary, the difficulty of proving the uniqueness of fluid solutions is greatly 
reduced by choosing an appropriate state representation, v(-). The fact that such 
a simple (linear) transformation from s(-) to v(-) can create one-sided Lipschitz 
continuity and greatly simplify the analysis may be of independent interest. 

6.3 Convergence to Fluid Solution over a Finite 
Horizon 

We now prove Theorem 6.

Proof. (Theorem 6) The proof follows from the sample-path tightness in Proposition 11 and the uniqueness of fluid limits from Theorem 4. By assumption, the sequence of initial conditions $V^{(0,N)}$ converges to some $v^0 \in \overline{\mathcal{V}}^\infty$ in probability. Since the space $\overline{\mathcal{V}}^\infty$ is separable and complete under the $\|\cdot\|_w$ metric, by Skorohod's representation theorem, we can find a probability space $(\Omega_0, \mathcal{F}_0, \mathbb{P}_0)$ on which $V^{(0,N)} \to v^0$ almost surely. By Proposition 11 and Theorem 4, for almost every $\omega \in \Omega$, any subsequence of $V^N(\omega,t)$ contains a further subsequence that converges to the unique fluid limit $v(v^0,t)$ uniformly on any compact interval $[0,T]$. Therefore, for all $T < \infty$,

$$\lim_{N\to\infty}\sup_{t\in[0,T]}\|V^N(\omega,t) - v(v^0,t)\|_w = 0, \quad \mathbb{P}\text{-almost surely}, \qquad (6.37)$$

which implies convergence in probability, and Eq. (3.16) holds. □

6.4 Convergence to the Invariant State v^ 

We will prove Theorem 5 in this section. We switch to the alternative state representation, $s(t)$, where

$$s_i(t) = v_i(t) - v_{i+1}(t), \quad \forall i \ge 0, \qquad (6.38)$$

to study the evolution of a fluid solution as $t \to \infty$. It turns out that a nice monotonicity property of the evolution of $s(t)$ induced by the drift structure will help establish the convergence to the invariant state. We recall that $s_0(t) = 1$ for all $t$, and that for all points where $v$ is differentiable,

$$\dot{s}_i(t) = \dot{v}_i(t) - \dot{v}_{i+1}(t) = \lambda(s_{i-1} - s_i) - (1-p)(s_i - s_{i+1}) - g_i^s(s),$$

for all $i \ge 1$, where $g_i^s(s) = g_i(v) - g_{i+1}(v)$. Throughout this section, we will use both representations $v(t)$ and $s(t)$ to refer to the same fluid solution, with their relationship specified in Eq. (6.38).

The approach we will be using is essentially a variant of the convergence proof given in [3]. The idea is to partition the space $\overline{\mathcal{S}}^\infty$ into dominating classes, and show that (i) dominance in initial conditions is preserved by the fluid model, and (ii) any solution $s(t)$ to the fluid model with an initial condition that dominates or is dominated by the invariant state $s^I$ converges to $s^I$ as $t \to \infty$. Properties (i) and (ii) together imply the convergence of the fluid solution $s(t)$ to $s^I$, as $t \to \infty$, for any finite initial condition. It turns out that such dominance in $s$ is much stronger than a similarly defined relation for $v$. For this reason we do not use $v$ but instead rely on $s$ to establish the result.

Definition 18. (Coordinate-wise Dominance) For any $s, s' \in \overline{\mathcal{S}}^\infty$, we write

1. $s \succeq s'$ if $s_i \ge s_i'$, for all $i \ge 0$.

2. $s \succ s'$ if $s \ne s'$, $s \succeq s'$ and $s_i > s_i'$, for all $i \ge 1$ where $s_i' > 0$.²

The following lemma states that $\succeq$-dominance in initial conditions is preserved by the fluid model.

Lemma 19. Let $s^1(\cdot)$ and $s^2(\cdot)$ be two solutions to the fluid model such that $s^1(0) \succeq s^2(0)$. Then $s^1(t) \succeq s^2(t)$, $\forall t \ge 0$.

Proof. By the continuous dependence of a fluid limit on its initial condition (Corollary 15), it suffices to verify that $s^1(t) \succeq s^2(t)$, $\forall t \ge 0$, whenever $s^1(0) \succ s^2(0)$ (strictly dominated initial conditions).

Let $t_1$ be the first time when $s^1(t)$ and $s^2(t)$ become equal and are both positive at at least one coordinate:

$$t_1 = \inf\left\{t \ge 0 : s^1(t) \ne s^2(t),\ s_i^1(t) = s_i^2(t) > 0 \text{ for some } i \ge 1\right\}. \qquad (6.39)$$

If $t_1 = \infty$, one of the following must be true:

(1) $s^1(t) \succ s^2(t)$, for all $t \ge 0$, in which case the claim holds.

(2) $s^1(t') = s^2(t')$ at some $t' < \infty$. By the uniqueness of solutions, $s^1(t) = s^2(t)$ for all $t \ge t'$, in which case the claim also holds.

² We need the condition $s \ne s'$ in order to rule out the case where $s = s' = 0$.

Hence, we assume $t_1 < \infty$. Let $k$ be the smallest coordinate index such that $s^1(t_1)$ and $s^2(t_1)$ are equal at $k$, but differ on at least one of the two adjacent coordinates, $k-1$ and $k+1$:

$$k = \min\left\{i \ge 1 : s_i^1(t_1) = s_i^2(t_1) > 0,\ \max_{j\in\{i-1,i+1\}}\left(s_j^1(t_1) - s_j^2(t_1)\right) > 0\right\}. \qquad (6.40)$$

Since $s^1(t_1) \succeq s^2(t_1)$, at all regular points $t < t_1$ that are close enough to $t_1$,

4W - 4it) ^ x{sU - sLi) - (1 -p)(sLi - 4.i) - i9l{^') - 9i{^')), (6.41) 

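where

$$
\begin{aligned}
g_k^s(s^1) - g_k^s(s^2) &\le 0\cdot\mathbb{I}\{s_{k+1}^1 > 0\} + \left[\left(p - \min\{p, \lambda s_k^1\}\right) - \left(p - \min\{p, \lambda s_k^2\}\right)\right]\cdot\mathbb{I}\{s_{k+1}^1 = 0\} \\
&= 0, \qquad (6.42)
\end{aligned}
$$

where the last equality comes from the fact that $s_k^1(t_1) = s_k^2(t_1)$ by the definition of $k$. Because $s^1(t)$ and $s^2(t)$ are continuous functions of $t$ in every coordinate, we can find a time $t_0 < t_1$ such that $s_k^1(t_0) > s_k^2(t_0)$ and

$$\dot{s}_k^1(t) - \dot{s}_k^2(t) > 0, \qquad (6.43)$$

for all regular $t \in (t_0, t_1)$. Since $s_k^1(t_1) - s_k^2(t_1) = s_k^1(t_0) - s_k^2(t_0) + \int_{t_0}^{t_1}\left(\dot{s}_k^1(t) - \dot{s}_k^2(t)\right)dt$, this contradicts the fact that $s_k^1(t_1) = s_k^2(t_1)$, and hence proves the claim. □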

We are now ready to prove Theorem 5.

Proof. (Theorem 5) Let $s(\cdot)$, $s^u(\cdot)$, and $s^l(\cdot)$ be three fluid limits with initial conditions in $\overline{\mathcal{S}}^\infty$ such that $s^u(0) \succeq s(0) \succeq s^l(0)$ and $s^u(0) \succeq s^I \succeq s^l(0)$. By Lemma 19, we must have $s^u(t) \succeq s(t) \succeq s^l(t)$ and $s^u(t) \succeq s^I \succeq s^l(t)$ for all $t \ge 0$. Hence it suffices to show that $\lim_{t\to\infty}\|s^u(t) - s^I\|_w = \lim_{t\to\infty}\|s^l(t) - s^I\|_w = 0$. Recall that, for any regular $t > 0$,

$$
\begin{aligned}
\dot{v}_i(t) &= \lambda(v_{i-1}(t) - v_i(t)) - (1-p)(v_i(t) - v_{i+1}(t)) - g_i(v(t)) \\
&= \lambda s_{i-1}(t) - (1-p)s_i(t) - g_i(v(t)) \\
&= (1-p)\left(\frac{\lambda s_{i-1}(t) - g_i(v(t))}{1-p} - s_i(t)\right). \qquad (6.44)
\end{aligned}
$$

Recall, from the expressions for $s_i^I$ in Theorem 2, that $s_{i+1}^I \ge \frac{\lambda s_i^I - p}{1-p}$, $\forall i \ge 0$. From Eq. (6.44) and the fact that $s_0^u = s_0^I = 1$, we have

$$\dot{v}_1^u(t) = (1-p)\left(\frac{\lambda - g_1(v^u(t))}{1-p} - s_1^u(t)\right) \le (1-p)\left(s_1^I - s_1^u(t)\right), \qquad (6.45)$$

for all regular $t \ge 0$. To see why the above inequality holds, note that

$$\frac{\lambda - g_1(v^u(t))}{1-p} = \frac{\lambda - p}{1-p} \le s_1^I, \qquad (6.46)$$

whenever $s_1^u(t) > 0$, and

$$\frac{\lambda - g_1(v^u(t))}{1-p} = s_1^u(t) = 0, \qquad (6.47)$$

whenever $s_1^u(t) = s_1^I = 0$. We argue that Eq. (6.45) implies that

$$\lim_{t\to\infty}\left|s_1^I - s_1^u(t)\right| = 0. \qquad (6.48)$$

To see why this is true, let $h_1(t) = s_1^I - s_1^u(t)$, and suppose instead that

$$\limsup_{t\to\infty}\left|s_1^I - s_1^u(t)\right| = \delta > 0. \qquad (6.49)$$

Because $s^u(t) \succeq s^I$ for all $t$, this is equivalent to having

$$\liminf_{t\to\infty} h_1(t) = -\delta. \qquad (6.50)$$

Since $s^u(t)$ is a fluid limit and is $L$-Lipschitz-continuous along all coordinates, $h_1(t)$ is also $L$-Lipschitz-continuous. Therefore, we can find an increasing sequence $\{t_k\}_{k\ge 1} \subset \mathbb{R}_+$ with $\lim_{k\to\infty} t_k = \infty$, such that for some $\gamma > 0$ and all $k \ge 1$,

$$h_1(t) \le -\frac{1}{2}\delta, \quad \forall t \in [t_k - \gamma, t_k + \gamma]. \qquad (6.51)$$

Because $v_1^u(0) < \infty$ and $h_1(t) \le 0$ for all $t$, it follows from Eqs. (6.45) and (6.51) that there exists some $T_0 > 0$ such that

$$v_1^u(t) = v_1^u(0) + \int_{s=0}^{t}\dot{v}_1^u(s)\,ds \le v_1^u(0) + \int_{s=0}^{t}(1-p)h_1(s)\,ds < 0, \qquad (6.52)$$

for all $t \ge T_0$, which clearly contradicts the fact that $v_1^u(t) \ge 0$ for all $t$. This shows that we must have $\lim_{t\to\infty}|s_1^u(t) - s_1^I| = 0$.

We then proceed by induction. Suppose $\lim_{t\to\infty}|s_i^u(t) - s_i^I| = 0$ for some $i \ge 1$. By Eq. (6.44), we have

$$\dot{v}_{i+1}^u(t) = (1-p)\left(\frac{\lambda s_i^u(t) - g_{i+1}(v^u(t))}{1-p} - s_{i+1}^u(t)\right) \le (1-p)\left(s_{i+1}^I - s_{i+1}^u(t) + \epsilon_i^u(t)\right), \qquad (6.53)$$

where $\epsilon_i^u(t) = \frac{\lambda}{1-p}\left(s_i^u(t) - s_i^I\right) \to 0$ as $t \to \infty$ by the induction hypothesis. With the same argument as the one for $s_1$, we obtain $\lim_{t\to\infty}|s_{i+1}^u(t) - s_{i+1}^I| = 0$. This establishes the convergence of $s^u(t)$ to $s^I$ along all coordinates, which implies

$$\lim_{t\to\infty}\|s^u(t) - s^I\|_w = 0. \qquad (6.54)$$

Using the same set of arguments we can show that $\lim_{t\to\infty}\|s^l(t) - s^I\|_w = 0$. This completes the proof. □
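The global stability result can be illustrated by integrating the fluid model numerically. The sketch below uses a plain forward-Euler scheme on a truncated state vector, with coordinates clamped at zero; the step size, horizon, truncation length, and function names are all assumptions made for illustration, and a discretized trajectory of a discontinuous drift is only an approximation of the fluid solution.

```python
def fluid_drift(v, lam, p):
    """Drift F(v) of the fluid model on v = (v_0, ..., v_n), with v_{n+1} taken as 0."""
    n = len(v)
    dv = [0.0] * n
    for i in range(1, n):
        nxt = v[i + 1] if i + 1 < n else 0.0
        if v[i] > 0:
            g = p
        elif v[i - 1] > 0:
            g = min(lam * v[i - 1], p)
        else:
            g = 0.0
        dv[i] = lam * (v[i - 1] - v[i]) - (1 - p) * (v[i] - nxt) - g
    dv[0] = dv[1]
    return dv


def euler_fluid(v0, lam, p, dt=0.01, horizon=40.0):
    """Forward-Euler approximation of the fluid trajectory started at v0."""
    v = list(v0)
    for _ in range(int(round(horizon / dt))):
        dv = fluid_drift(v, lam, p)
        for i in range(1, len(v)):
            v[i] = max(0.0, v[i] + dt * dv[i])   # keep coordinates non-negative
        v[0] = v[1] + 1.0                        # boundary condition v_0 - v_1 = 1
    return v


# lam = 0.5, p = 0.25: Eq. (6.3) gives s_1^I = 1/3 and Eq. (6.4) gives s_2^I = 0,
# so the invariant state is v^I = (4/3, 1/3, 0, 0, ...).
v_T = euler_fluid([1.0, 0.0, 0.0, 0.0], lam=0.5, p=0.25)
print(v_T)
```

Starting from the empty system, which is dominated by the invariant state, the first coordinate increases towards $v_1^I = 1/3$ while the higher coordinates remain at zero. Other initial conditions can be passed to `euler_fluid` in the same way.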

6.4.1 A Finite-support Property of v() and Its Implications 

In this section, we discuss a finite-support property of the fluid solution $v(\cdot)$. Although this property is not directly used in the proofs of other results in our work, we have decided to include it here because it provides important, and somewhat surprising, qualitative insights into the system dynamics.

Proposition 20. Let $v^0 \in \overline{\mathcal{V}}^\infty$, and let $v(v^0,\cdot)$ be the unique solution to the fluid model with initial condition $v(v^0,0) = v^0$. If $p > 0$, then $v(v^0,t)$ has a finite support for all $t > 0$, in the sense that

$$\sup\{i : v_i(v^0,t) > 0\} < \infty, \quad \forall t > 0. \qquad (6.55)$$

Before presenting the proof, we observe that the finite-support property stated in Proposition 20 is independent of the size of the support of the initial condition $v^0$; even if all coordinates of $v(t)$ are strictly positive at $t = 0$, the support of $v(t)$ immediately "collapses" to a finite number for any $t > 0$.

Note that a critical assumption in Proposition 20 is that $p > 0$, i.e., the system has a non-trivial central server. In some sense, the "collapse" of $v(\cdot)$ into a finite support is essentially due to the fact that the central server always allocates its service capacity to the longest queues in the system. Proposition 20 illustrates that the worst-case queue length in the system is well under control at all times, thanks to the power of the central server.

Proposition 20 also sheds light on the structure of the invariant state of the fluid model, $v^I$. Recall from Theorem 2 that $v^I$ has a finite support whenever $p > 0$. Since by the global stability of fluid solutions (Theorem 5), we have that

$$\lim_{t\to\infty}\|v(t) - v^I\|_w = 0, \qquad (6.56)$$

the fact that $v(t)$ admits a finite support for any $t > 0$ whenever $p > 0$ provides strong intuition for, and partially explains, the finite-support property of $v^I$.

We now prove Proposition 20.

Proof. (Proposition 20) We fix some $v^0 \in \overline{\mathcal{V}}^\infty$, and for the rest of the proof we will write $v(\cdot)$ in place of $v(v^0,\cdot)$. It is not difficult to show, by directly inspecting the drift of the fluid model in Eq. (4), that if we start with an initial condition $v^0$ with a finite support, then the support remains finite at all times. Hence, we now assume $v_i^0 > 0$ for all $i$. First, the fact that $v^0 \in \overline{\mathcal{V}}^\infty$ (i.e., $v_1^0 < \infty$) implies

$$\lim_{i\to\infty} v_i^0 = 0. \qquad (6.57)$$

This is because all coordinates of the corresponding vector $s^0$ are non-negative, and

$$v_i^0 = v_1^0 - \sum_{j=1}^{i-1} s_j^0, \qquad (6.58)$$

where the second term on the right-hand side converges to $v_1^0$ as $i \to \infty$.

Assume that $v_i(t) > 0$ for all $i$, over some small time interval $t \in [0,s]$. Since the magnitude of the drift on any coordinate $v_i$ is uniformly bounded from above by $\lambda + 1$, and $\lim_{i\to\infty} v_i^0 = 0$, for any $\epsilon > 0$ we can find $s', N > 0$ such that for all $i \ge N$ and $t \in [0,s']$,

$$\dot{v}_i(t) = \lambda(v_{i-1} - v_i) - (1-p)(v_i - v_{i+1}) - g_i(v) \le \epsilon - g_i(v) = -p + \epsilon. \qquad (6.59)$$

Since $\lim_{i\to\infty} v_i^0 = 0$, Eq. (6.59) shows that it is impossible to find any strictly positive time interval $[0,s]$ during which the fluid trajectory $v(t)$ maintains an infinite support. This proves the claim. □
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Chapter 7 

Convergence of Steady-State 
Distributions 



We will prove Theorem 7 in this chapter, which states that, for all $N$, the Markov process $V^N(t)$ converges to a unique steady-state distribution, $\pi^N$, as $t \to \infty$, and that the sequence $\{\pi^N\}_{N\ge 1}$ concentrates on the unique invariant state of the fluid model, $v^I$, as $N \to \infty$. This result is of practical importance, as it guarantees that key quantities, such as the average queue length, derived from the expressions of $v^I$ also serve as accurate approximations for those of an actual finite stochastic system in the long run.

Note that by the end of this chapter, we will have established our steady-state approximation results, i.e.,

$$V^N(t) \xrightarrow{t\to\infty} \pi^N \xrightarrow{N\to\infty} v^I \qquad (7.1)$$

as was illustrated in Figure 3-3 of Chapter 3. Together with the transient approximation results established in the previous chapters, these conclude the proofs of all approximation theorems in this thesis.

Before proving Theorem 7, we first give an important proposition which strengthens the finite-horizon convergence result stated in Theorem 6, by showing a uniform speed of convergence over any compact set of initial conditions. This proposition will be critical to the proof of Theorem 7, which will appear later in the chapter.
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7.1 Uniform Rate of Convergence to the Fluid 
Limit 

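Let the probability space $(\Omega_1, \mathcal{F}_1, \mathbb{P}_1)$ be the product space of $(\Omega_W, \mathcal{F}_W, \mathbb{P}_W)$ and $(\Omega_U, \mathcal{F}_U, \mathbb{P}_U)$. Intuitively, $(\Omega_1, \mathcal{F}_1, \mathbb{P}_1)$ captures all exogenous arrival and service information. Fixing $\omega_1 \in \Omega_1$ and $v^0 \in \overline{\mathcal{V}}^\infty \cap \mathcal{Q}^N$, denote by $V^N(v^0, \omega_1, t)$ the resulting sample path of $V^N$ given the initial condition $V^N(0) = v^0$. Also, denote by $v(v^0,t)$ the solution to the fluid model for a given initial condition $v^0$. We have the following proposition.

Proposition 21. (Uniform Rate of Convergence to the Fluid Limit) Fix $T > 0$ and $M \in \mathbb{N}$. Let $K^N = \overline{\mathcal{V}}^M \cap \mathcal{Q}^N$. We have

$$\lim_{N\to\infty}\sup_{v^0\in K^N} d^{\mathbb{Z}_+}\left(V^N(v^0,\omega_1,\cdot), v(v^0,\cdot)\right) = 0, \quad \mathbb{P}_1\text{-almost surely}, \qquad (7.2)$$

where the metric $d^{\mathbb{Z}_+}(\cdot,\cdot)$ was defined in Eq. (5.5).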

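Proof. The proof highlights the convenience of the sample-path-based approach. By
the same argument as in Lemma 9, we can find sets $C_W \subset \Omega_W$ and
$C_U \subset \Omega_U$ such that the convergence in Eqs. (5.1) and (5.2) holds over $C_W$
and $C_U$, respectively, and that $\mathbb{P}_W(C_W) = \mathbb{P}_U(C_U) = 1$. Let
$C_1 = C_W \times C_U$. Note that $\mathbb{P}_1(C_1) = 1$.

To prove the claim, it suffices to show that

$$\lim_{N \to \infty} \sup_{\mathbf{v}^0 \in K^N} d^{\mathbb{Z}_+}\left(\mathbf{V}^N(\mathbf{v}^0, \omega_1, \cdot), \mathbf{v}(\mathbf{v}^0, \cdot)\right) = 0, \quad \forall \omega_1 \in C_1. \qquad (7.3)$$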

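We start by assuming that the above convergence fails for some
$\tilde{\omega}_1 \in C_1$, which amounts to having a sequence of "bad" sample paths of
$\mathbf{V}^N$ that stay a positive distance away from the corresponding fluid solution
with the same initial condition, as $N \to \infty$. We then find nested subsequences
within this sequence of bad sample paths, and construct two solutions to the fluid
model with the same initial condition, contradicting the uniqueness of fluid model
solutions.

Assume that there exists $\tilde{\omega}_1 \in C_1$ such that

$$\limsup_{N \to \infty} \sup_{\mathbf{v}^0 \in K^N} d^{\mathbb{Z}_+}\left(\mathbf{V}^N(\mathbf{v}^0, \tilde{\omega}_1, \cdot), \mathbf{v}(\mathbf{v}^0, \cdot)\right) > 0. \qquad (7.4)$$

This implies that there exist $\epsilon > 0$, $\{N_i\}_{i=1}^{\infty} \subset \mathbb{N}$, and
$\{\mathbf{v}^{(0,N_i)}\}_{i=1}^{\infty}$ with $\mathbf{v}^{(0,N_i)} \in K^{N_i}$, such that

$$d^{\mathbb{Z}_+}\left(\mathbf{V}^{N_i}(\mathbf{v}^{(0,N_i)}, \tilde{\omega}_1, \cdot), \mathbf{v}(\mathbf{v}^{(0,N_i)}, \cdot)\right) > \epsilon, \qquad (7.5)$$

for all $i \in \mathbb{N}$. We make the following two observations: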

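1. The set $\overline{\mathcal{V}}^M$ is closed and bounded, and the fluid solution
$\mathbf{v}(\mathbf{v}^{(0,N_i)}, \cdot)$ is $L$-Lipschitz-continuous for all $i$. Hence the
sequence of functions $\{\mathbf{v}(\mathbf{v}^{(0,N_i)}, \cdot)\}_{i=1}^{\infty}$ is
equicontinuous and uniformly bounded on $[0,T]$. By the Arzela-Ascoli theorem, there
exists a subsequence $\{N_i^1\}_{i=1}^{\infty}$ of $\{N_i\}_{i=1}^{\infty}$ such that

$$d^{\mathbb{Z}_+}\left(\mathbf{v}(\mathbf{v}^{(0,N_i^1)}, \cdot), \mathbf{v}^a(\cdot)\right) \to 0, \qquad (7.6)$$

as $i \to \infty$, for some Lipschitz-continuous function $\mathbf{v}^a(\cdot)$ with
$\mathbf{v}^a(0) \in \overline{\mathcal{V}}^M$. By the continuous dependence of fluid
solutions on initial conditions (Corollary 15), $\mathbf{v}^a(\cdot)$ must be the unique
solution to the fluid model with initial condition $\mathbf{v}^a(0)$, i.e.,

$$\mathbf{v}^a(t) = \mathbf{v}(\mathbf{v}^a(0), t), \quad \forall t \in [0,T]. \qquad (7.7)$$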

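2. Since $\tilde{\omega}_1 \in C_1$, by Propositions 11 and 12 there exists a further
subsequence $\{N_i^2\}_{i=1}^{\infty}$ of $\{N_i^1\}_{i=1}^{\infty}$ such that
$\mathbf{V}^{N_i^2}(\mathbf{v}^{(0,N_i^2)}, \tilde{\omega}_1, \cdot) \to \mathbf{v}^b(\cdot)$
uniformly over $[0,T]$ as $i \to \infty$, where $\mathbf{v}^b(\cdot)$ is a solution to the
fluid model. Note that since $\{N_i^2\}_{i=1}^{\infty} \subset \{N_i^1\}_{i=1}^{\infty}$, we
have $\mathbf{v}^b(0) = \mathbf{v}^a(0)$. Hence,

$$\mathbf{v}^b(t) = \mathbf{v}(\mathbf{v}^a(0), t), \quad \forall t \in [0,T]. \qquad (7.8)$$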

By the definition of $\tilde{\omega}_1$ (Eq. (7.4)) and the fact that
$\tilde{\omega}_1 \in C_1$, we must have
$\sup_{t \in [0,T]} \|\mathbf{v}^a(t) - \mathbf{v}^b(t)\|_w \geq \epsilon$, which, in light
of Eqs. (7.7) and (7.8), contradicts the uniqueness of the fluid limit (Theorem 4).
This completes the proof. □

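The following corollary, stated in terms of convergence in probability, follows
directly from Proposition 21. The proof is straightforward and is omitted.

Corollary 22. Fix $T > 0$ and $M \in \mathbb{N}$. Let
$K^N = \overline{\mathcal{V}}^M \cap \mathcal{Q}^N$. Then, for all $\delta > 0$,

$$\lim_{N \to \infty} \mathbb{P}_1\left(\omega_1 \in \Omega_1 : \sup_{\mathbf{v}^0 \in K^N} d^{\mathbb{Z}_+}\left(\mathbf{V}^N(\mathbf{v}^0, \omega_1, \cdot), \mathbf{v}(\mathbf{v}^0, \cdot)\right) > \delta\right) = 0. \qquad (7.9)$$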



7.2 Proof of Theorem 7

We first state a tightness result that will be needed in the proof of Theorem 7.
Proposition 23. For every $N < \infty$ and $p \in (0,1]$, $\mathbf{V}^N(t)$ is positive
recurrent and converges in distribution to a unique steady-state distribution
$\pi^{N,p}$ as $t \to \infty$. Furthermore, the sequence $\{\pi^{N,p}\}_{N=1}^{\infty}$ is
tight, in the sense that for all $\epsilon > 0$, there exists $M > 0$ such that

$$\pi^{N,p}\left(\overline{\mathcal{V}}^M\right) = \pi^{N,p}\left(\mathbf{V}_1^N \leq M\right) \geq 1 - \epsilon, \quad \forall N \geq 1. \qquad (7.10)$$

Proof Sketch. The proposition is proved using a stochastic dominance argument,
by coupling with the case $p = 0$. While the notation may seem heavy, the intuition is
simple: when $p = 0$, the system degenerates into a collection of M/M/1 queues with
independent arrivals and departures (but possibly correlated initial queue lengths),
and it is easy to show that the system is positive recurrent and that the resulting
sequence of steady-state distributions is tight as $N \to \infty$. The bulk of the proof
is to formally argue that when $p > 0$, the system behaves "no worse" than when
$p = 0$ in terms of positive recurrence and tightness of steady-state distributions.
See Appendix A.2 for a complete proof using this stochastic dominance approach. □

Remark. It is worth mentioning that the tightness of $\pi^{N,p}$ could alternatively
be established by defining a Lyapunov function on $\overline{\mathcal{V}}^N$ and checking
its drift with respect to the underlying embedded discrete-time Markov chain. By
applying the Foster-Lyapunov stability criterion, one should be able to prove positive
recurrence of $\mathbf{V}^N$ and give an explicit upper bound on the expected value of
$\mathbf{V}_1^N$ in steady state.¹ If this expectation is bounded as $N \to \infty$, we
will have obtained the desired result by the Markov inequality. We do not pursue this
direction in this thesis, because we believe that the stochastic dominance approach
adopted here provides more insight by exploiting the monotonicity in $p$ of the
steady-state queue length distribution.
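
As a minimal sketch of the last step of this alternative route (which is not pursued in this thesis), suppose that a Foster-Lyapunov argument produced a bound $B$ on the steady-state mean of $\mathbf{V}_1^N$ that holds uniformly in $N$. The constant $B$ is hypothetical and is introduced here only for illustration. Markov's inequality would then yield tightness directly:

```latex
\pi^{N,p}\left( \mathbf{V}_1^N > M \right)
  \;\le\; \frac{\mathbb{E}_{\pi^{N,p}}\left[ \mathbf{V}_1^N \right]}{M}
  \;\le\; \frac{B}{M}
  \;\le\; \epsilon ,
  \qquad \text{for all } N \ge 1 \text{ and all } M \ge B / \epsilon .
```

Under this assumption, any $M \geq B/\epsilon$ would serve as the constant required in the definition of tightness.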

Proof. (Theorem 7) For the rest of the proof, since $p$ is fixed, we will drop $p$ in the
superscript of $\pi^{N,p}$. By Proposition 23, the sequence of distributions $\pi^N$ is
tight, in the sense that for any $\epsilon > 0$, there exists $M(\epsilon) \in \mathbb{N}$
such that for all $M \geq M(\epsilon)$,

$$\pi^N\left(\overline{\mathcal{V}}^M \cap \mathcal{Q}^N\right) \geq 1 - \epsilon, \quad \text{for all } N.$$

The rest of the proof is based on a classical technique using continuous test
functions (see Chapter 4 of [2]). The continuous dependence on initial conditions
and the uniform rate of convergence established previously will be used here. Let $C$
be the space of bounded continuous functions from $\overline{\mathcal{V}}^\infty$ to
$\mathbb{R}$. Define the mappings $T^N(t)$ and $T(t)$ on $C$ by:

$$(T^N(t) f)(\mathbf{v}^0) \triangleq \mathbb{E}\left[f(\mathbf{V}^N(t)) \mid \mathbf{V}^N(0) = \mathbf{v}^0\right],$$

$$\text{and} \quad (T(t) f)(\mathbf{v}^0) \triangleq \mathbb{E}\left[f(\mathbf{v}(t)) \mid \mathbf{v}(0) = \mathbf{v}^0\right] = f(\mathbf{v}(\mathbf{v}^0, t)), \quad \text{for } f \in C.$$

¹ For an overview of the use of the Foster-Lyapunov criterion in proving stability in
queueing networks, see, e.g., [14].

With this notation, $\pi^N$ being a steady-state distribution for the Markov process
$\mathbf{V}^N(t)$ is equivalent to having, for all $t \geq 0$ and $f \in C$,

$$\int T^N(t) f \, d\pi^N = \int f \, d\pi^N. \qquad (7.11)$$



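Since $\{\pi^N\}$ is tight, it is sequentially compact under the topology of weak
convergence, by Prokhorov's theorem. Let $\pi$ be the weak limit of some subsequence of
$\{\pi^N\}$. We will show that for all $t \geq 0$ and $f \in C$,

$$\left| \int_{\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty} T(t) f(\mathbf{v}^0) \, d\pi(\mathbf{v}^0) - \int_{\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty} f(\mathbf{v}^0) \, d\pi(\mathbf{v}^0) \right| = 0. \qquad (7.12)$$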



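In other words, $\pi$ is also a steady-state distribution for the deterministic fluid
limit. Since by Theorem 2 the invariant state of the fluid limit is unique, Eq. (7.12)
will imply that $\pi(\mathbf{v}^I) = 1$, and this proves the theorem.

To show Eq. (7.12), we write

$$\begin{aligned}
\left| \int T(t) f \, d\pi - \int f \, d\pi \right|
&\leq \limsup_{N \to \infty} \left| \int T(t) f \, d\pi - \int T(t) f \, d\pi^N \right| \\
&\quad + \limsup_{N \to \infty} \left| \int T(t) f \, d\pi^N - \int T^N(t) f \, d\pi^N \right| \\
&\quad + \limsup_{N \to \infty} \left| \int T^N(t) f \, d\pi^N - \int f \, d\pi \right|. \qquad (7.13)
\end{aligned}$$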



We will show that all three terms on the right-hand side of Eq. (7.13) are zero.
Since $\mathbf{v}(\mathbf{v}^0, t)$ depends continuously on the initial condition
$\mathbf{v}^0$ (Corollary 15), we have $T(t) f \in C$ for all $t \geq 0$, which along with
$\pi^N \Rightarrow \pi$ implies that the first term is zero. For the third term, since
$\pi^N$ is the steady-state distribution of $\mathbf{V}^N$, we have that
$\int T^N(t) f \, d\pi^N = \int f \, d\pi^N$ for all $t \geq 0$ and $f \in C$. Since
$\pi^N \Rightarrow \pi$, this implies that the last term is zero.



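To bound the second term, fix some $M \in \mathbb{N}$ and let
$K = \overline{\mathcal{V}}^M$. We have

$$\begin{aligned}
\limsup_{N \to \infty} \left| \int T(t) f \, d\pi^N - \int T^N(t) f \, d\pi^N \right|
&\leq \limsup_{N \to \infty} \left| \int_K T(t) f \, d\pi^N - \int_K T^N(t) f \, d\pi^N \right| \\
&\quad + \limsup_{N \to \infty} \left| \int_{K^c} T(t) f \, d\pi^N - \int_{K^c} T^N(t) f \, d\pi^N \right| \\
&\overset{(a)}{\leq} \limsup_{N \to \infty} \int_K \left| T^N(t) f - T(t) f \right| d\pi^N + \limsup_{N \to \infty} 2 \|f\| \, \pi^N(K^c) \\
&\overset{(b)}{=} \limsup_{N \to \infty} 2 \|f\| \, \pi^N(K^c), \qquad (7.14)
\end{aligned}$$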



where $K^c = \overline{\mathcal{V}}^\infty - K$ and
$\|f\| = \sup_{\mathbf{v} \in \overline{\mathcal{V}}^\infty} |f(\mathbf{v})|$. The
inequality (a) holds because $T(t)$ and $T^N(t)$ are both conditional expectations and
are hence contraction mappings with respect to the sup-norm $\|f\|$. Equality (b)
(i.e., $\limsup_{N \to \infty} \int_K |T^N(t) f - T(t) f| \, d\pi^N = 0$) can be shown
using an argument involving interchanges of the order of integration, which essentially
follows from the uniform rate of convergence to the fluid limit over the compact set
$K$ of initial conditions (Corollary 22). We isolate equality (b) in the following
claim:

Claim 24. Let $K$ be a compact subset of $\overline{\mathcal{V}}^\infty$. Then

$$\limsup_{N \to \infty} \int_K \left| T^N(t) f - T(t) f \right| d\pi^N = 0. \qquad (7.15)$$

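Proof. Fix any $\delta > 0$. There exists $N(\delta) > 0$ such that for all
$N \geq N(\delta)$, we have

$$\begin{aligned}
\left| \int_K T(t) f \, d\pi^N - \int_K T^N(t) f \, d\pi^N \right|
&\leq \int_K \left| T(t) f - T^N(t) f \right| d\pi^N \\
&= \int_{\mathbf{v}^0 \in K} \left| f(\mathbf{v}(\mathbf{v}^0, t)) - \mathbb{E}\left[f(\mathbf{V}^N(t)) \mid \mathbf{V}^N(0) = \mathbf{v}^0\right] \right| d\pi^N(\mathbf{v}^0) \\
&\overset{(a)}{\leq} \int_{\mathbf{v}^0 \in K} \sup_{\mathbf{v}^* \in \overline{\mathcal{V}}^\infty, \, \|\mathbf{v}^* - \mathbf{v}(\mathbf{v}^0, t)\|_w \leq \delta} \left| f(\mathbf{v}(\mathbf{v}^0, t)) - f(\mathbf{v}^*) \right| d\pi^N(\mathbf{v}^0) \\
&\leq \omega_f(K^\delta, \delta),
\end{aligned}$$

where $K^\delta$ is the $\delta$-extension of $K$,

$$K^\delta = \left\{ \mathbf{x} \in \overline{\mathcal{V}}^\infty : \|\mathbf{x} - \mathbf{y}\|_w \leq \delta \text{ for some } \mathbf{y} \in K \right\}, \qquad (7.16)$$

and $\omega_f(X, \delta)$ is defined to be the modulus of continuity of $f$ restricted to
the set $X$:

$$\omega_f(X, \delta) \triangleq \sup_{\mathbf{x}, \mathbf{y} \in X, \, \|\mathbf{x} - \mathbf{y}\|_w \leq \delta} |f(\mathbf{x}) - f(\mathbf{y})|. \qquad (7.17)$$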

To see why inequality (a) holds, recall that by Corollary 22, starting from a
compact set of initial conditions, the sample paths of a finite system stay uniformly
close to those of the fluid limit on a compact time interval with high probability.
Inequality (a) then follows from Eq. (7.9) and the fact that $f$ is bounded. Because
$K$ is a compact set, it is not difficult to show that $K^{\delta_0}$ is also compact for
some fixed $\delta_0 > 0$. Hence $f$ is uniformly continuous on $K^{\delta_0}$, and we have

$$\limsup_{N \to \infty} \left| \int_K T(t) f \, d\pi^N - \int_K T^N(t) f \, d\pi^N \right| \leq \limsup_{\delta \to 0} \omega_f(K^{\delta_0}, \delta) = 0, \qquad (7.18)$$

which establishes the claim. □

Going back, since Eq. (7.14) holds for any $K = \overline{\mathcal{V}}^M$,
$M \in \mathbb{N}$, we have, by the tightness of $\pi^N$, that the middle term in
Eq. (7.13) is also zero. This shows that any limit point $\pi$ of $\{\pi^N\}$ is indeed
concentrated on the unique invariant state of the fluid model ($\mathbf{v}^I$). This
completes the proof of Theorem 7. □
Chapter 8

Conclusions and Future Work 



The overall objective of this thesis is to study how the degree of centralization in
allocating computing or processing resources impacts performance. This investigation
was motivated by applications in server farms and cloud centers, as well as more
general scheduling problems with communication constraints. Using a fluid model
and associated convergence theorems, we showed that any small degree of
centralization induces an exponential performance improvement in the steady-state
scaling of system delay, for sufficiently large systems. Simulations show good
accuracy of the model even for moderately sized finite systems ($N = 100$).

There are several interesting and important questions which we did not address
in this thesis. We have left out the question of what happens when the central server
adopts a scheduling policy different from the Longest-Queue-First (LQF) policy
considered in this thesis. Since scheduling a task from a longest queue may require
significant global communication overhead, other scheduling policies that require
less global information may be of great practical interest. Some alternatives include:

1. (Random k-Longest-Queues) The central server always serves a task from a
queue chosen uniformly at random among the $k$ most loaded queues, where
$k \geq 2$ is a fixed integer. Note that the LQF policy is a sub-case, corresponding
to $k = 1$.

2. (Random Work-Conserving) The central server always serves a task from a
queue chosen uniformly at random among all non-empty queues.
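
The three central-server policies discussed above differ only in how the served queue is selected. The following is a minimal sketch of that selection step; the function names, the list-of-integers representation of queue lengths, and the arbitrary tie-breaking rule are assumptions made here for illustration and are not part of the model definition in this thesis.

```python
import random


def lqf(queues):
    """Longest-Queue-First: index of a most-loaded queue (ties broken by index)."""
    return max(range(len(queues)), key=lambda i: queues[i])


def random_k_longest(queues, k, rng):
    """Index chosen uniformly at random among the k most-loaded queues."""
    ranked = sorted(range(len(queues)), key=lambda i: queues[i], reverse=True)
    return rng.choice(ranked[:k])


def random_work_conserving(queues, rng):
    """Index chosen uniformly at random among non-empty queues, or None if all are empty."""
    nonempty = [i for i, q in enumerate(queues) if q > 0]
    return rng.choice(nonempty) if nonempty else None


rng = random.Random(0)
queues = [3, 0, 5, 1]
print(lqf(queues))                       # always the index of the longest queue
print(random_k_longest(queues, 2, rng))  # one of the two longest queues
print(random_work_conserving(queues, rng))
```

With `k = 1`, `random_k_longest` reduces to `lqf`, which matches the remark that LQF corresponds to $k = 1$.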

It will be interesting to see whether a similar exponential improvement in the delay
scaling is still present under these other policies. Based on the analysis done in this
thesis, as well as some heuristic calculations using the fluid model, we conjecture that
in order for the phase transition phenomenon to occur, a strictly positive fraction of
the central service tokens must be used to serve a longest queue. Hence, between the
two policies listed above, the former is more likely than the latter to exhibit a
similar delay scaling improvement.

Assuming the LQF policy is used, another interesting question is whether a
non-trivial delay scaling can be observed if $p$, instead of being fixed, is a function
of $N$ and decreases to zero as $N \to \infty$. This is again of practical relevance,
because having a central server whose processing speed scales linearly with $N$ may be
expensive or infeasible for certain applications. To this end, we conjecture that the
answer is negative, in that as long as $\limsup_{N \to \infty} p(N) = 0$, the limiting
delay scaling will be the same as if $p(N)$ is fixed at $p = 0$, in which case
$\mathbf{v}_1 \sim \frac{1}{1 - \lambda}$ as $\lambda \to 1$.

On the modeling end, some of our current assumptions could be restrictive for
practical applications. For example, the transmission delays between the local and
central stations are assumed to be negligible compared to processing times; this may
not be true for data centers that are separated by significant geographic distances.
Also, the arrival and service processes are assumed to be Poisson, while in reality
more general traffic distributions (e.g., heavy-tailed traffic) are observed. Finally,
the speed of the central server may not be able to scale linearly in $N$ for large $N$.
Further work to extend the current model by incorporating these realistic constraints
could be of great interest, although obtaining theoretical characterizations seems
quite challenging.

Lastly, the surprisingly simple expressions in our results make it tempting to ask 
whether similar performance characterizations can be obtained for other stochas- 
tic systems with partially centralized control laws; insights obtained here may find 
applications beyond the realm of queueing theory. 
Appendix A

Appendix: Additional Proofs 



A.1 Complete Proof of Proposition 11

Here we will follow a line of argument in [1] to establish the existence of a set of
fluid limits. We begin with some definitions. Recall the uniform metric, $d(\cdot, \cdot)$,
defined on $D[0,T]$:

$$d(x, y) = \sup_{t \in [0,T]} |x(t) - y(t)|, \quad x, y \in D[0,T]. \qquad (A.1)$$

Definition 25. Let $E_c$ be a non-empty compact subset of $D[0,T]$. A sequence of
subsets of $D[0,T]$, $\mathcal{E} = \{E_N\}_{N \geq 1}$, is said to be asymptotically close
to the set $E_c$ if the distance to $E_c$ of any element in $E_N$ decreases to zero
uniformly, i.e.:

$$\lim_{N \to \infty} \sup_{x \in E_N} d(x, E_c) = 0, \qquad (A.2)$$

where the distance from a point to a set is defined as

$$d(x, E_c) = \inf_{y \in E_c} d(x, y). \qquad (A.3)$$

Definition 26. A point $y \in D[0,T]$ is said to be a cluster point of a sequence
$\{x_N\}_{N \geq 1}$ if its $\gamma$-neighborhood is visited by $\{x_N\}_{N \geq 1}$
infinitely often for all $\gamma > 0$, i.e.,

$$\liminf_{N \to \infty} d(x_N, y) = 0. \qquad (A.4)$$

A point $y \in D[0,T]$ is a cluster point of a sequence of subsets
$\mathcal{E} = \{E_N\}_{N \geq 1}$ if it is a cluster point of some
$\{x_N\}_{N \geq 1}$ such that:

$$x_N \in E_N, \quad \forall N \geq 1. \qquad (A.5)$$

Lemma 27. Let $C(\mathcal{E})$ be the set of cluster points of
$\mathcal{E} = \{E_N\}_{N \geq 1}$. If $\mathcal{E}$ is asymptotically close to a compact
and closed set $E_c$, then:

1. $\mathcal{E}$ is asymptotically close to $C(\mathcal{E})$.

2. $C(\mathcal{E}) \subset E_c$.

Proof. Suppose that the first claim is false. Then there exists a subsequence
$\{x_{N_i}\}_{i \geq 1}$, where $x_{N_i} \in E_{N_i}$ for all $i$, such that

$$d(x_{N_i}, C(\mathcal{E})) \geq \gamma > 0, \quad \forall i \geq 1. \qquad (A.6)$$

However, since $\mathcal{E}$ is asymptotically close to $E_c$ by assumption, there exists
$\{y_i\} \subset E_c$ such that

$$d(x_{N_i}, y_i) \to 0, \quad \text{as } i \to \infty. \qquad (A.7)$$

Since $E_c$ is compact, $\{y_i\}$ has a convergent subsequence with limit $y$. By
Eq. (A.7), $y$ is a cluster point of $\{x_{N_i}\}$, and hence a cluster point of
$\mathcal{E}$, contradicting Eq. (A.6). This proves the first claim.

The second claim is an easy consequence of the closedness of $E_c$. Let $x$ be any
point in $C(\mathcal{E})$. By the definition of a cluster point, there exists a
subsequence $\{x_{N_i}\}$, where $x_{N_i} \in E_{N_i}$ for all $i$, such that
$\lim_{i \to \infty} d(x_{N_i}, x) = 0$. By the same reasoning as in the first part of the
proof (Eq. (A.7)), there exists a sequence $\{y_i\} \subset E_c$ which also converges to
$x$. Since $E_c$ is closed, $x \in E_c$. □

We now put the above definitions into our context. Define
$\mathcal{E} = \{E_N\}_{N \geq 1}$ to be a sequence of subsets of $D[0,T]$ such that

$$E_N = \left\{ x \in D[0,T] : |x(0) - x^0| \leq M_N, \text{ and } |x(a) - x(b)| \leq L|a - b| + \gamma_N, \ \forall a, b \in [0,T] \right\}, \qquad (A.8)$$

where $x^0$ is a constant, and $M_N \downarrow 0$ and $\gamma_N \downarrow 0$ are two
sequences of diminishing non-negative numbers. We first characterize the set of cluster
points of the sequence $\mathcal{E}$. Loosely speaking, $\mathcal{E}$ represents a sequence
of sets of sample paths that come increasingly close to the set of
$L$-Lipschitz-continuous functions, in that all elements of $E_N$ are
"$\gamma_N$-approximately" Lipschitz-continuous. The definition below and the lemmas
that follow formalize this notion.

Define $E_c$ as the set of Lipschitz-continuous functions on $[0,T]$ with Lipschitz
constant $L$ and initial values bounded by a positive constant $M$:

$$E_c = \left\{ x \in D[0,T] : |x(0)| \leq M, \text{ and } |x(a) - x(b)| \leq L|a - b|, \ \forall a, b \in [0,T] \right\}. \qquad (A.9)$$

We have the following characterization of $E_c$.

Lemma 28. $E_c$ is compact.

Proof. $E_c$ is a set of $L$-Lipschitz-continuous functions $x(\cdot)$ on $[0,T]$ with
initial values contained in a closed and bounded interval. By the Arzela-Ascoli
theorem, every sequence of points in $E_c$ contains a further subsequence which
converges to some $x^*(\cdot)$ uniformly on $[0,T]$. Since all elements in $E_c$ are
$L$-Lipschitz-continuous, $x^*(\cdot)$ is also $L$-Lipschitz-continuous on $[0,T]$. It is
clear that $x^*(\cdot)$ also satisfies $|x^*(0)| \leq M$. Hence, $x^*(\cdot) \in E_c$. □

Lemma 29. $\mathcal{E}$ is asymptotically close to $E_c$.

Proof. It suffices to show that for all $x \in E_N$, there exists some
$L$-Lipschitz-continuous function $y$ such that

$$d(x, y) \leq C \gamma_N, \qquad (A.10)$$

where $C$ is a fixed constant, independent of $N$. Fixing $x \in D[0,T]$ such that

$$|x(a) - x(b)| \leq L|a - b| + \gamma, \quad \forall a, b \in [0,T], \qquad (A.11)$$

we will use a truncation argument to construct an $L$-Lipschitz-continuous function
$y(t)$ that uniformly approximates $x(t)$. For the rest of the proof, we use the
shorthand $[a \pm \gamma]$ to denote the closed interval $[a - \gamma, a + \gamma]$. The
following two claims are useful:

Claim 30. There exist $y_0 \in [x(0) \pm \gamma]$ and $y_T \in [x(T) \pm \gamma]$ such that

$$|y_T - y_0| \leq TL. \qquad (A.12)$$

In particular, this implies that the linear interpolation between $(0, y_0)$ and
$(T, y_T)$,

$$y(t) = y_0 + \frac{y_T - y_0}{T} \, t, \qquad (A.13)$$

is $L$-Lipschitz-continuous.

Proof. Substituting $a = 0$, $b = T$ in Eq. (A.11), we get

$$-LT - \gamma \leq x(0) - x(T) \leq LT + \gamma. \qquad (A.14)$$

Write

$$y_0 - y_T = x(0) - x(T) + (y_0 - x(0)) - (y_T - x(T)). \qquad (A.15)$$

The claim then follows from Eqs. (A.14) and (A.15), by noting that
$(y_0 - x(0)) - (y_T - x(T))$ can take any value between $-2\gamma$ and $2\gamma$. □

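Claim 31. Given any two points $y_0 \in [x(0) \pm \gamma]$ and
$y_T \in [x(T) \pm \gamma]$ such that $|y_0 - y_T| \leq TL$, there exists
$y_{T/2} \in [x(\frac{T}{2}) \pm \gamma]$ such that

$$\left| y_0 - y_{T/2} \right| \leq \frac{TL}{2}, \quad \text{and} \quad \left| y_T - y_{T/2} \right| \leq \frac{TL}{2}. \qquad (A.16)$$

Proof. Without loss of generality, assume that $y_0 \leq y_T$. We have

$$|y_0 - z| \geq |y_T - z|, \ \forall z \geq \frac{y_0 + y_T}{2}, \quad \text{and} \quad |y_0 - z| \leq |y_T - z|, \ \forall z \leq \frac{y_0 + y_T}{2}. \qquad (A.17)$$

By Claim 30, we can find $y'_{T/2}, y''_{T/2} \in [x(\frac{T}{2}) \pm \gamma]$ such that

$$\left| y_0 - y'_{T/2} \right| \leq \frac{TL}{2}, \quad \text{and} \quad \left| y''_{T/2} - y_T \right| \leq \frac{TL}{2}. \qquad (A.18)$$

By Eq. (A.17), at least one of $y'_{T/2}$ and $y''_{T/2}$ can be used as $y_{T/2}$ to
satisfy Eq. (A.16). An identical argument applies if $y_0 \geq y_T$. □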

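Using Claim 31, we can repeat the same process to find $y_{T/4}$ given $y_0$ and
$y_{T/2}$, and $y_{3T/4}$ given $y_{T/2}$ and $y_T$. Proceeding recursively in this way, at
the $N$th iteration we will have found a sequence $\{y_{iT/2^N}\}_{i=0}^{2^N}$ such that

$$y_{iT/2^N} \in \left[ x\left(\frac{iT}{2^N}\right) \pm \gamma \right] \quad \text{and} \quad \left| y_{iT/2^N} - y_{(i+1)T/2^N} \right| \leq \frac{LT}{2^N}. \qquad (A.19)$$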



Denote by $y^N(t)$ the linear interpolation of $\{y_{iT/2^N}\}_{i=0}^{2^N}$. We then have
that for all $0 \leq t \leq \frac{T}{2^N}$,

$$\begin{aligned}
|y^N(t) - x(t)| &= \left| y^N(0) - x(0) + (y^N(t) - y^N(0)) - (x(t) - x(0)) \right| \\
&\leq |y^N(0) - x(0)| + |y^N(t) - y^N(0)| + |x(t) - x(0)| \\
&\leq \gamma + \frac{LT}{2^N} + \left( \frac{LT}{2^N} + \gamma \right), \qquad (A.20)
\end{aligned}$$

where the first two terms in the last inequality follow from Eq. (A.19), and the last
term follows from Eq. (A.11). An identical bound on $|y^N(t) - x(t)|$ holds over all
other intervals $\left[\frac{iT}{2^N}, \frac{(i+1)T}{2^N}\right]$, $1 \leq i \leq 2^N - 1$.
Since $y^N(t)$ is piecewise linear with magnitudes of the slopes no greater than $L$,
we have constructed a sequence of $L$-Lipschitz-continuous functions such that

$$\sup_{0 \leq t \leq T} |y^N(t) - x(t)| \leq 2\gamma + \frac{LT}{2^{N-1}}. \qquad (A.21)$$

The proof of the lemma is completed by letting $C$ be any constant greater than 2,
and picking the approximating function $y$ to be any $y^N$ for sufficiently large $N$. □

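Finally, the following lemma states that all sample paths $\mathbf{X}^N(\omega, \cdot)$
with $\omega \in C$ belong to $E_N$, with appropriately chosen $\{M_N\}_{N \geq 1}$ and
$\{\gamma_N\}_{N \geq 1}$.

Lemma 32. Suppose that there exists $\mathbf{v}^0 \in \overline{\mathcal{V}}^\infty$ such
that for all $\omega \in C$,

$$\|\mathbf{V}^N(\omega, 0) - \mathbf{v}^0\|_w \leq \tilde{M}_N, \qquad (A.22)$$

for some $\tilde{M}_N \downarrow 0$. Then for all $\omega \in C$ and $i \in \mathbb{Z}_+$,
there exist $L > 0$ and sequences $M_N \downarrow 0$ and $\gamma_N \downarrow 0$ such that

$$\mathbf{X}_i^N(\omega, \cdot) \in E_N, \qquad (A.23)$$

where $E_N$ is defined as in Eq. (A.8).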



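Proof. Intuitively, the lemma follows from three facts: the scaled sample paths of the
event process $W^N(\omega, t)$ converge uniformly to $(1 + \lambda)t$ (Lemma 9), every
jump along any coordinate of the sample path has magnitude $\frac{1}{N}$, and all
coordinates of $\mathbf{X}^N$ are dominated by $W^N$ in terms of the total number of
jumps.

Based on the previous coupling construction, each coordinate of $\mathbf{A}^N$,
$\mathbf{L}^N$ and $\mathbf{C}^N$ is monotonically non-decreasing, with a positive jump at
time $t$ of magnitude $\frac{1}{N}$ only if there is a jump of the same size at time $t$
in $W^N(\omega, \cdot)$. Hence for all $i \geq 1$,

$$|\mathbf{A}_i^N(\omega, a) - \mathbf{A}_i^N(\omega, b)| \leq |W^N(\omega, a) - W^N(\omega, b)|, \quad \forall a, b \in [0,T]. \qquad (A.24)$$

The same inequalities hold for $\mathbf{L}^N$ and $\mathbf{C}^N$. Since by construction,

$$\mathbf{V}_i^N(\omega, t) = \mathbf{V}_i^N(\omega, 0) + \mathbf{A}_i^N(\omega, t) - \mathbf{L}_i^N(\omega, t) - \mathbf{C}_i^N(\omega, t), \quad \forall i \geq 1, \qquad (A.25)$$

we have that for all $i \geq 1$,

$$|\mathbf{X}_i^N(\omega, a) - \mathbf{X}_i^N(\omega, b)| \leq 3 |W^N(\omega, a) - W^N(\omega, b)|, \quad \forall a, b \in [0,T]. \qquad (A.26)$$

Since $\omega \in C$, $W^N(\omega, \cdot)$ converges uniformly to $(\lambda + 1)t$ on $[0,T]$
by Lemma 9. This implies that there exists a sequence $\tilde{\gamma}_N \downarrow 0$ such
that for all $N \geq 1$,

$$|W^N(\omega, a) - W^N(\omega, b)| \leq (\lambda + 1)|a - b| + \tilde{\gamma}_N, \quad \forall a, b \in [0,T], \qquad (A.27)$$

which, in light of Eq. (A.26), implies

$$|\mathbf{X}_i^N(\omega, a) - \mathbf{X}_i^N(\omega, b)| \leq 3(\lambda + 1)|a - b| + 3\tilde{\gamma}_N, \quad \forall a, b \in [0,T], \ i \geq 1. \qquad (A.28)$$

Finally, note that all coordinates of $\mathbf{X}^N(\omega, 0)$ except for
$\mathbf{V}^N(\omega, 0)$ are equal to 0 by definition. The proof is completed by setting
$M_N = 2^i \tilde{M}_N$, $\gamma_N = 3\tilde{\gamma}_N$, and $L = 3(\lambda + 1)$. □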

We are now ready to prove Proposition 11.

Proof. (Proposition 11) Let us first summarize the key results we have so far:

1. (Lemma 28) $E_c$ is a set of $L$-Lipschitz-continuous functions with bounded values
at 0, and it is compact and closed.

2. (Lemma 29) $\mathcal{E} = \{E_N\}_{N \geq 1}$, a sequence of sets of
$\gamma_N$-approximately $L$-Lipschitz-continuous functions with convergent initial
values, is asymptotically close to $E_c$.

3. (Lemma 32) For all $\omega \in C$, $\mathbf{X}^N(\omega, \cdot)$ is in $\mathcal{E}$.

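The rest is straightforward: Pick any $\omega \in C$. By the above statements, for any
$i \in \mathbb{Z}_+$ one can find a subsequence
$\{\mathbf{X}_i^{N_j}(\omega, \cdot)\}_{j=1}^{\infty}$ and a sequence
$\{y_j\}_{j=1}^{\infty} \subset E_c$ such that

$$d\left(\mathbf{X}_i^{N_j}(\omega, \cdot), y_j\right) \to 0, \quad \text{as } j \to \infty. \qquad (A.29)$$

Since by Lemma 28 (statement 1 above), $E_c$ is compact and closed,
$\{y_j\}_{j=1}^{\infty}$ has a limit point $y^*$ in $E_c$, which implies that a further
subsequence of $\{\mathbf{X}_i^{N_j}(\omega, \cdot)\}_{j=1}^{\infty}$ converges to $y^*$.
Moreover, since $\mathbf{V}^N(\omega, 0) \to \mathbf{v}^0$ and
$\mathbf{A}^N(\omega, 0) = \mathbf{L}^N(\omega, 0) = \mathbf{C}^N(\omega, 0) = 0$,
$y^*(0)$ is unique. This proves the existence of an $L$-Lipschitz-continuous limit
point $y^*(\cdot)$ at any single coordinate $i$ of $\mathbf{X}^N(\cdot)$.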

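With the coordinate-wise limit points, we then use a diagonal argument to construct
the limit points of $\mathbf{X}^N$ in the $D^{\mathbb{Z}_+}[0,T]$ space. Let $v_1(t)$ be
any $L$-Lipschitz-continuous limit point of $\mathbf{V}_1^N$, so that a subsequence
$\mathbf{V}_1^{N_j^1}(\omega, \cdot) \to v_1$ as $j \to \infty$ in $d(\cdot, \cdot)$. Then
proceed recursively by letting $v_{i+1}(t)$ be a limit point of a subsequence of
$\{\mathbf{V}_{i+1}^{N_j^i}(\omega, \cdot)\}_{j=1}^{\infty}$, where
$\{N_j^i\}_{j=1}^{\infty}$ are the indices for the $i$th subsequence. Finally, define the
vector $\mathbf{v}$ coordinate-wise by

$$\mathbf{v}_i = v_i, \quad \forall i \in \mathbb{Z}_+. \qquad (A.30)$$

We claim that $\mathbf{v}$ is indeed a limit point of $\mathbf{V}^N$ in the
$d^{\mathbb{Z}_+}(\cdot, \cdot)$ metric. To see this, first note that for all $N$,

$$\mathbf{V}_1^N(\omega, t) \geq \mathbf{V}_i^N(\omega, t) \geq 0, \quad \forall i \geq 1, \ t \in [0,T]. \qquad (A.31)$$

Since we constructed the limit point $\mathbf{v}$ by repeatedly selecting nested
subsequences, this property extends to $\mathbf{v}$, i.e.,

$$v_1(t) \geq v_i(t) \geq 0, \quad \forall i \geq 1, \ t \in [0,T]. \qquad (A.32)$$

Since $v_1(0) = \mathbf{v}_1^0$ and $v_1(t)$ is $L$-Lipschitz-continuous, we have that

$$\sup_{t \in [0,T]} |v_i(t)| \leq \sup_{t \in [0,T]} |v_1(t)| \leq |\mathbf{v}_1^0| + LT, \quad \forall i \in \mathbb{Z}_+. \qquad (A.33)$$

Set $N_1 = 1$, and let

$$N_k = \min\left\{ N \geq N_{k-1} : \sup_{1 \leq i \leq k} d\left(\mathbf{V}_i^N(\omega, \cdot), v_i\right) \leq \frac{1}{k} \right\}, \quad \forall k \geq 2. \qquad (A.34)$$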

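Note that the construction of $\mathbf{v}$ implies that $N_k$ is well defined and finite
for all $k$. From Eq. (A.33) and Eq. (A.34), we have for all $k \geq 2$,

$$d^{\mathbb{Z}_+}\left(\mathbf{V}^{N_k}(\omega, \cdot), \mathbf{v}\right) = \sup_{t \in [0,T]} \sqrt{\sum_{i=0}^{\infty} \frac{\left|\mathbf{V}_i^{N_k}(\omega, t) - v_i(t)\right|^2}{2^i}} \leq \frac{1}{k} + \sqrt{\left(|\mathbf{v}_1^0| + LT\right)^2 \sum_{i=k+1}^{\infty} \frac{1}{2^i}}. \qquad (A.35)$$

Hence $d^{\mathbb{Z}_+}\left(\mathbf{V}^{N_k}(\omega, \cdot), \mathbf{v}\right) \to 0$ as
$k \to \infty$. The existence of the limit points $\mathbf{a}(t)$, $\mathbf{l}(t)$ and
$\mathbf{c}(t)$ can be established by an identical argument. This completes the
proof. □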

A.2 Proof of Proposition 23

Proof. (Proposition 23) Fix $N > 0$ and $0 < p \leq 1$. For the rest of the proof,
denote by $\mathbf{V}^{N,p_0}(t)$ the sample path of $\mathbf{V}^N(t)$ when $p = p_0$. Let
$\{\mathbf{V}^{N,p}[n]\}_{n \geq 0}$ be the discrete-time embedded Markov chain for
$\mathbf{V}^{N,p}(t)$, defined as

$$\mathbf{V}^{N,p}[n] = \mathbf{V}^{N,p}(t_n), \quad n \geq 0, \qquad (A.36)$$

where $t_n$, $n \geq 1$, is defined previously as the time of the $n$th event in the
system (i.e., the $n$th jump in $W^N(\cdot)$), with the convention that $t_0 = 0$.

Definition 33. (Stochastic Dominance) Let $\{X[n]\}_{n \geq 0}$ and
$\{Y[n]\}_{n \geq 0}$ be two discrete-time stochastic processes taking values in
$\mathbb{R}_+$. We say that $\{X[n]\}_{n \geq 0}$ is stochastically dominated by
$\{Y[n]\}_{n \geq 0}$, denoted by $\{X[n]\}_{n \geq 0} \preceq_{st} \{Y[n]\}_{n \geq 0}$,
if there exist random processes $\{X'[n]\}_{n \geq 0}$ and $\{Y'[n]\}_{n \geq 0}$ defined
on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$, such that

1. $X'$ and $Y'$ have the same distributions as $X$ and $Y$, respectively.

2. $X'[n] \leq Y'[n]$, $\forall n \geq 0$, $\mathbb{P}$-almost surely.

We have the following lemma:

Lemma 34. Fix any $p \in (0,1]$. If $\mathbf{V}^{N,p}[0] = \mathbf{V}^{N,0}[0]$, then
$\{\mathbf{V}_1^{N,p}[n]\}_{n \geq 0} \preceq_{st} \{\mathbf{V}_1^{N,0}[n]\}_{n \geq 0}$.

Proof. We will first interpret the system with $p > 0$ as that of an optimal
scheduling policy with a time-varying channel. The result will then follow from the
classical result in Theorem 3 in [12], with a slightly modified arrival assumption, but
almost identical proof steps. Recall the Secondary Motivation described in
Section 1.1.2. Here we will use a similar but modified interpretation: instead of
thinking of the central server as deciding between serving a most-loaded station
versus serving a random station, imagine that the central server always serves a
most-loaded station among the ones that are connected to it. The channel between the
central server and local stations, represented by a set of connected stations, evolves
according to the following dynamics and is independent across different time slots:

1. With probability $p$, all $N$ stations are connected to the central server.

2. Otherwise, only one station, chosen uniformly at random from the $N$ stations,
is connected to the central server.

It is easy to see that, under the above channel dynamics, a system in which a central
server always serves a most-loaded station among connected stations will produce
the same distribution for $\mathbf{V}^{N,p}[n]$ as our original system. The case $p = 0$
is equivalent to scheduling tasks under the same channel condition just described, but
with a server that serves a station chosen uniformly at random among all connected
stations. The advantage of the above interpretation is that it allows us to treat
$\mathbf{V}^{N,p}[n]$ and $\mathbf{V}^{N,0}[n]$ as the resulting aggregate queue length
processes obtained by applying two different scheduling policies to the same arrival,
token generation, and channel processes. In particular, $\mathbf{V}_1^{N,p}[n]$
corresponds to the resulting normalized total queue length process
($\mathbf{V}_1^{N,p}[n] = \frac{1}{N} \sum_{i=1}^{N} Q_i(t_n)$) when a longest-queue-first
policy is applied, and $\mathbf{V}_1^{N,0}[n]$ corresponds to the normalized total queue
length process when a fully random scheduling policy is applied. Theorem 3 of [12]
states that when the arrival and channel processes are symmetric with respect to the
identities of stations, the total queue length process under a longest-queue-first
policy is stochastically dominated by that under any other causal policy (i.e., any
policy that uses only information from the past). Since the arrival and channel
processes are symmetric in our case, and a random scheduling policy falls under the
category of causal policies, the statement of Theorem 3 of [12] implies the validity of
our claim.

There is, however, a minor difference between the assumptions of Theorem 3 of [12]
and our setup that we note here. In [12], it is possible that both arrivals and service
occur during the same slot, while in our case, each event corresponds either to an
arrival to a queue or to the generation of a service token, but not both. This
technical difference can be overcome by discussing separately whether the current slot
corresponds to an arrival or a service. The structure of the proof for Theorem 3 in
[12] remains unchanged after this modification, and is hence not repeated here. □
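
The channel interpretation used in the proof above can be made concrete with a short simulation of the embedded chain. The following is a minimal sketch under stated assumptions: each event is taken to be an arrival with probability $\lambda/(1+\lambda)$ and a service token otherwise (as implied by the event rates $N\lambda$ and $N$), ties among most-loaded stations are broken by index, and the function name and parameters are hypothetical. The sketch only compares long-run averages of the normalized total queue length for two values of $p$; it does not reproduce the coupling behind Theorem 3 of [12].

```python
import random


def simulate_avg_queue(N, lam, p, steps, seed=0):
    """Time-averaged normalized total queue length of the embedded chain.

    Each step is one event: an arrival to a uniformly chosen queue, or a
    service token served under the channel interpretation (all N stations
    connected with probability p, otherwise one uniformly chosen station),
    with the server always picking a most-loaded connected station.
    """
    rng = random.Random(seed)
    q = [0] * N
    total = 0          # current total number of tasks in the system
    acc = 0.0          # running sum of the normalized total queue length
    for _ in range(steps):
        if rng.random() < lam / (1.0 + lam):
            q[rng.randrange(N)] += 1
            total += 1
        else:
            if rng.random() < p:
                connected = range(N)
            else:
                connected = [rng.randrange(N)]
            i = max(connected, key=lambda j: q[j])
            if q[i] > 0:   # a token that finds an empty queue is wasted
                q[i] -= 1
                total -= 1
        acc += total / N
    return acc / steps


if __name__ == "__main__":
    print(simulate_avg_queue(20, 0.8, 0.5, 50000, seed=1))  # partially centralized
    print(simulate_avg_queue(20, 0.8, 0.0, 50000, seed=1))  # fully local (p = 0)
```

With $p = 0$ the channel always connects a single random station, so the same function also simulates the fully local system used as the dominating process.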
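Using the discrete-time stochastic dominance result in Lemma 34, we can now
establish a similar dominance for the continuous-time processes
$\mathbf{V}_1^{N,p}(t)$ and $\mathbf{V}_1^{N,0}(t)$. Since
$\{\mathbf{V}_1^{N,p}[n]\}_{n \geq 0} \preceq_{st} \{\mathbf{V}_1^{N,0}[n]\}_{n \geq 0}$,
by the definition of stochastic dominance, we can construct
$\{\mathbf{V}^{N,p}[n]\}_{n \geq 0}$ and $\{\mathbf{V}^{N,0}[n]\}_{n \geq 0}$ on a common
probability space $(\Omega_d, \mathcal{F}_d, \mathbb{P}_d)$, such that
$\mathbf{V}_1^{N,p}[n] \leq \mathbf{V}_1^{N,0}[n]$ for all $n \geq 0$,
$\mathbb{P}_d$-almost surely. Recall from previous sections that $W^N(t)$, the $N$th
event process, is a Poisson jump process with rate $N(1 + \lambda)$, defined on the
probability space $(\Omega_W, \mathcal{F}_W, \mathbb{P}_W)$. Let
$(\Omega_c, \mathcal{F}_c, \mathbb{P}_c)$ be the product space of
$(\Omega_d, \mathcal{F}_d, \mathbb{P}_d)$ and $(\Omega_W, \mathcal{F}_W, \mathbb{P}_W)$.
Define two continuous-time random processes on $(\Omega_c, \mathcal{F}_c, \mathbb{P}_c)$
by

$$\tilde{\mathbf{V}}^{N,p}(t) = \mathbf{V}^{N,p}\left[N W^N(t)\right], \qquad (A.37)$$

$$\text{and} \quad \tilde{\mathbf{V}}^{N,0}(t) = \mathbf{V}^{N,0}\left[N W^N(t)\right]. \qquad (A.38)$$

Note that $N W^N(t)$ is a Poisson jump process, and hence $N W^N(t) \in \mathbb{Z}_+$
for all $0 \leq t < \infty$, $\mathbb{P}_c$-almost surely. Since
$\mathbf{V}_1^{N,p}[n] \leq \mathbf{V}_1^{N,0}[n]$ for all $n$ almost surely by
construction, this implies

$$\tilde{\mathbf{V}}_1^{N,p}(t) \leq \tilde{\mathbf{V}}_1^{N,0}(t), \quad \forall t \geq 0, \text{ almost surely}. \qquad (A.39)$$

Recall that the processes $\{\mathbf{V}^{N,p}[n]\}_{n \geq 0}$ and
$\{\mathbf{V}^{N,0}[n]\}_{n \geq 0}$ were defined to be the embedded discrete-time
processes for $\mathbf{V}^{N,p}(t)$ and $\mathbf{V}^{N,0}(t)$. It is also easy to check
that the continuous-time Markov process $\mathbf{V}^{N,p}(t)$ is uniform for all $N$ and
$p$ (i.e., the rate until the next event is the same at all states). Hence, the
processes $\tilde{\mathbf{V}}^{N,p}(t)$ and $\tilde{\mathbf{V}}^{N,0}(t)$ constructed above
have the same distributions as the original processes $\mathbf{V}^{N,p}(t)$ and
$\mathbf{V}^{N,0}(t)$, respectively. Therefore, as we work with the processes
$\tilde{\mathbf{V}}^{N,p}(t)$ and $\tilde{\mathbf{V}}^{N,0}(t)$ in the rest of the proof,
it is understood that any statement regarding the distributions of
$\tilde{\mathbf{V}}^{N,p}(t)$ and $\tilde{\mathbf{V}}^{N,0}(t)$ automatically holds for
$\mathbf{V}^{N,p}(t)$ and $\mathbf{V}^{N,0}(t)$, and vice versa.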
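
The time-change construction above, in which a continuous-time path is read off from an embedded discrete-time sequence and a common Poisson event process, can be sketched as follows. The function names and the list-based representation are assumptions made for illustration; `rate` plays the role of the event rate $N(1+\lambda)$, and the chain is assumed to have at least one more entry than the number of event times.

```python
import bisect
import random


def poisson_event_times(rate, horizon, rng):
    """Event times of a rate-`rate` Poisson process on [0, horizon]."""
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def value_at(chain, event_times, t):
    """Value of the continuous-time path at time t.

    The number of events in [0, t] indexes the embedded chain, mirroring
    the construction V(t) = V[number of events up to time t].
    """
    n = bisect.bisect_right(event_times, t)
    return chain[n]


# Two embedded chains ordered pointwise stay ordered after the time change,
# because both are indexed by the same event count.
times = [0.5, 1.2, 3.0]
lower = [0, 1, 0, 1]
upper = [0, 1, 2, 3]
for t in (0.0, 0.5, 2.9, 3.0):
    assert value_at(lower, times, t) <= value_at(upper, times, t)
```

Because both paths share the same event times, a pointwise ordering of the embedded sequences carries over to every time $t$, which is the content of Eq. (A.39).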

We first look at the behavior of $\mathbf{V}^{N,0}(t)$. When $p = 0$, only local service
tokens are generated. Hence, it is easy to see that the system degenerates into $N$
individual M/M/1 queues with independent and identical statistics for arrivals and
service token generation. In particular, for any station $i$, the arrivals follow a
Poisson process of rate $\lambda$ and the generation of service tokens follows a Poisson
process of rate 1. Since $\lambda < 1$, it is not difficult to verify that the process
$\mathbf{V}^{N,0}(t)$ is positive recurrent, and that it admits a unique steady-state
distribution, denoted by $\pi^{N,0}$, which satisfies:

$$\pi^{N,0}\left(\mathbf{V}_1 \leq x\right) = \mathbb{P}\left(\frac{1}{N} \sum_{i=1}^{N} E_i \leq x\right), \quad \forall x \in \mathbb{R}, \qquad (A.40)$$

where $\{E_i\}_{i=1}^{N}$ is a set of i.i.d. geometrically distributed random variables,
with

$$\mathbb{P}(E_i = k) = \lambda^k (1 - \lambda), \quad \forall k \in \mathbb{Z}_+. \qquad (A.41)$$

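We now argue that $\mathbf{V}^{N,p}(t)$ is also positive recurrent for all
$p \in (0,1]$. Let

$$\tau^p = \inf\left\{ t \geq 0 : \mathbf{V}_1^{N,p}(t) = 0, \text{ and } \mathbf{V}_1^{N,p}(s) \neq 0 \text{ for some } 0 \leq s \leq t \right\}. \qquad (A.42)$$

In other words, if the system starts empty (i.e., $\mathbf{V}_1^{N,p}(0) = 0$), $\tau^p$
is the first time that the system becomes empty again after having visited some
non-empty state. Since the process $\mathbf{V}^{N,p}(t)$ can be easily verified to be
irreducible (i.e., all states communicate) for all $p \in (0,1]$,
$\mathbf{V}^{N,p}(t)$ is positive recurrent if and only if

$$\mathbb{E}\left[\tau^p \mid \mathbf{V}_1^{N,p}(0) = 0\right] < \infty. \qquad (A.43)$$

Since $\mathbf{V}_1^{N,p}(t) \leq \mathbf{V}_1^{N,0}(t)$ for all $t \geq 0$ almost surely,
we have $\tau^p \leq \tau^0$ almost surely. From the positive recurrence of
$\mathbf{V}^{N,0}(t)$, we have

$$\mathbb{E}\left[\tau^p \mid \mathbf{V}_1^{N,p}(0) = 0\right] \leq \mathbb{E}\left[\tau^0 \mid \mathbf{V}_1^{N,0}(0) = 0\right] < \infty. \qquad (A.44)$$

This establishes that $\mathbf{V}^{N,p}(t)$ is positive recurrent for all $p \in (0,1]$.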

To complete the proof, we need the following standard result from the theory of 
Markov processes (see, e.g., [T3]). 

Lemma 35. If $X(t)$ is an irreducible and positive recurrent Markov process taking
values in a countable set $\mathcal{I}$, then there exists a unique steady-state
distribution $\pi$ such that for any initial distribution of $X(0)$,

$$\lim_{t \to \infty} \mathbb{P}(X(t) = i) = \pi(i), \quad \forall i \in \mathcal{I}. \tag{A.45}$$

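By the positive recurrence of $V^{N,p}(t)$ and Lemma 35, we have that $V^{N,p}(t)$
converges in distribution to a unique steady-state distribution $\pi^{N,p}$ as
$t \to \infty$. Combining this with the dominance relation in Eq. (A.37), we have
that for any initial distribution of $V^{N,p}(0)$,

$$
\begin{aligned}
\pi^{N,p}\left(\overline{\mathcal{V}}^M\right) &= \pi^{N,p}(V_1 \le M) \\
&= \lim_{t \to \infty} \mathbb{P}\left(V_1^{N,p}(t) \le M\right) && \text{(by Lemma 35)} \\
&\ge \lim_{t \to \infty} \mathbb{P}\left(V_1^{N,0}(t) \le M\right) && \text{(by Eq. (A.37))} \\
&= \pi^{N,0}(V_1 \le M) && \text{(by Lemma 35)} \\
&= \mathbb{P}\left(\frac{1}{N}\sum_{i=1}^{N} E_i \le M\right) && \text{(by Eq. (A.40))}
\end{aligned}
\tag{A.46}
$$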

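Since the $E_i$s are i.i.d. geometric random variables, by Markov's inequality,

$$\pi^{N,p}\left(\overline{\mathcal{V}}^M\right) \ge 1 - \mathbb{P}\left(\frac{1}{N}\sum_{i=1}^{N} E_i > M\right) \ge 1 - \frac{\mathbb{E}(E_1)}{M} = 1 - \frac{\lambda}{(1-\lambda)M}, \tag{A.47}$$

for all $M > \mathbb{E}(E_1) = \frac{\lambda}{1-\lambda}$, which establishes the tightness of
$\{\pi^{N,p}\}_{N=1}^{\infty}$. This completes the proof of Proposition 23. □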
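
As an informal sanity check of the Markov-inequality step, the following Python sketch compares a Monte Carlo estimate of the tail probability of the average of $N$ i.i.d. geometric variables with the bound $\lambda / ((1-\lambda)M)$. The function names, the seed, and the parameter values are illustrative assumptions and are not part of the proof.

```python
import random


def geometric_sample(lam, rng):
    """Draw E with P(E = k) = lam**k * (1 - lam), for k = 0, 1, 2, ..."""
    k = 0
    while rng.random() < lam:
        k += 1
    return k


def markov_bound(lam, m):
    """Markov bound on P((1/N) * sum E_i > m), using E[E_1] = lam / (1 - lam)."""
    return lam / ((1.0 - lam) * m)


def tail_estimate(n, lam, m, trials, seed=0):
    """Monte Carlo estimate of P((1/n) * sum_{i<=n} E_i > m)."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        avg = sum(geometric_sample(lam, rng) for _ in range(n)) / n
        if avg > m:
            exceed += 1
    return exceed / trials


if __name__ == "__main__":
    lam, m = 0.5, 2.0  # illustrative values with m > lam / (1 - lam)
    print("estimate:", tail_estimate(20, lam, m, trials=2000))
    print("bound:   ", markov_bound(lam, m))
```

The bound does not depend on $N$, which is the property needed for tightness; the empirical tail is typically far smaller than the bound.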



Appendix B 

Appendix: Simulation Setup 



The simulation results shown in Figure 3-2 for a finite system with 100 stations were
obtained by simulating the embedded discrete-time Markov chain, $\{Q[n]\}_{n \in \mathbb{N}}$, where
the vector $Q[n] \in \mathbb{Z}_+^{100}$ records the queue lengths of all 100 queues at time step $n$.
Specifically, we start with $Q[1] = 0$, and, during each time step, one of the following
takes place:

1. With probability $\frac{\lambda}{1+\lambda}$, a queue is chosen uniformly at random from all queues,
and one new task is added to this queue. This corresponds to an arrival to the 
system. 

2. With probability $\frac{1-p}{1+\lambda}$, a queue is chosen uniformly at random from all queues,
and one task is removed from the queue if the queue is non-empty. If the 
chosen queue is empty, no change is made to the queue length vector. This 
corresponds to the generation of a local service token. 

3. With probability $\frac{p}{1+\lambda}$, a queue is chosen uniformly at random from the longest
queues, and one task is removed from the chosen queue if the queue is non- 
empty. If all queues are empty, no change is made to the queue length vector. 
This corresponds to the generation of a central service token. 
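
The three transition types above can be sketched in Python as follows. This is a minimal illustration rather than the code used to produce the reported results. It assumes that arrivals, local tokens, and central tokens occur with probabilities $\lambda/(1+\lambda)$, $(1-p)/(1+\lambda)$, and $p/(1+\lambda)$, which is what normalizing the per-station rates $\lambda$, $1-p$, and $p$ gives; the names `step` and `simulate` are hypothetical.

```python
import random


def step(queues, lam, p, rng):
    """Advance the embedded chain by one time step, mutating `queues` in place."""
    # Scale a uniform draw by the total per-station event rate 1 + lam (assumed).
    u = rng.random() * (1.0 + lam)
    if u < lam:
        # Arrival: add one task to a queue chosen uniformly at random.
        queues[rng.randrange(len(queues))] += 1
    elif u < lam + (1.0 - p):
        # Local service token: serve a uniformly chosen queue if it is non-empty.
        i = rng.randrange(len(queues))
        if queues[i] > 0:
            queues[i] -= 1
    else:
        # Central service token: serve one of the longest queues,
        # breaking ties uniformly at random; do nothing if all are empty.
        longest = max(queues)
        if longest > 0:
            candidates = [i for i, q in enumerate(queues) if q == longest]
            queues[rng.choice(candidates)] -= 1


def simulate(n_stations, lam, p, n_steps, seed=0):
    """Run the chain from the empty state and return the final queue lengths."""
    rng = random.Random(seed)
    queues = [0] * n_stations
    for _ in range(n_steps):
        step(queues, lam, p, rng)
    return queues


if __name__ == "__main__":
    final = simulate(n_stations=100, lam=0.9, p=0.05, n_steps=200_000)
    print("average queue length:", sum(final) / len(final))
```

Finding the longest queues by a linear scan costs time proportional to the number of stations per central token, which is acceptable for 100 stations; a larger system might call for a priority structure instead.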

To make the connection between the above discrete-time Markov chain $Q[n]$ and the
continuous-time Markov process $Q(t)$ considered in this thesis, one can show that
$Q(t)$ is uniformized and hence the steady-state distribution of $Q(t)$ coincides with
that of the embedded discrete-time chain $Q[n]$.

To measure the steady-state queue length distribution seen by a typical task, we
sampled from the chain $Q[n]$ in the following fashion: $Q[n]$ was first run for a
burn-in period of 1,000,000 time steps, after which 500,000 samples were collected with 20
time steps between adjacent samples, where each sample recorded the current length
of a queue chosen uniformly at random from all queues. Denote by $S$ the set of all
samples. The average queue length, as marked by the symbol "x" in Figure 3-2, was
computed by taking the average over $S$. The upper (UE) and lower (LE) ends of the
95% confidence intervals were computed by:

$$\mathrm{UE} = \min\{x \in S : \text{there are no more than 2.5\% of the elements of } S \text{ that are strictly greater than } x\},$$

$$\mathrm{LE} = \max\{x \in S : \text{there are no more than 2.5\% of the elements of } S \text{ that are strictly less than } x\}.$$

Note that this notion of confidence interval is meant to capture the concentration 
of S around the mean, and is somewhat different from that used in the statistics 
literature for parameter estimation. 
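
The sampling procedure and the interval endpoints can be sketched in Python as follows. This is an illustration under stated assumptions: `step_fn` is a hypothetical callable that advances the chain by one time step in place, the first sample is taken `gap` steps after the burn-in ends, and the 2.5% threshold is expressed in integer arithmetic to avoid rounding issues.

```python
import random


def collect_samples(step_fn, queues, burn_in, n_samples, gap, seed=0):
    """Burn in, then record the length of a uniformly chosen queue every `gap` steps."""
    rng = random.Random(seed)
    for _ in range(burn_in):
        step_fn(queues)
    samples = []
    for _ in range(n_samples):
        for _ in range(gap):
            step_fn(queues)
        samples.append(queues[rng.randrange(len(queues))])
    return samples


def empirical_interval(samples, tail_permille=25):
    """Return (LE, UE) as defined above, with a 2.5% tail on each side by default."""
    s = sorted(samples)
    n = len(s)
    # Largest number of elements allowed strictly beyond an endpoint.
    k = (tail_permille * n) // 1000
    lower = s[k]          # at most k elements are strictly less than s[k]
    upper = s[n - 1 - k]  # at most k elements are strictly greater than s[n-1-k]
    return lower, upper


if __name__ == "__main__":
    data = list(range(1, 101))
    print("mean:", sum(data) / len(data))
    print("(LE, UE):", empirical_interval(data))
```

With the settings described above, one would call `collect_samples` with `burn_in=1_000_000`, `n_samples=500_000`, and `gap=20`.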

A separate version of the above experiment was run for each value of $\lambda$ marked
in Figure 3-2, while the level of centralization $p$ was fixed at 0.05 across all
experiments.